Pinning artifacts for expansion of search keys and search spaces in a natural language understanding (NLU) framework

ABSTRACT

Present embodiments include an agent automation framework having an artifact pinning subsystem that pins meaning representations of a search space to enable the agent automation system to target particularly relevant candidates for improved inferences. To generate the search space, the artifact pinning subsystem may determine multiple understandings of sample utterances within intent-entity models to generate meaning representations. The sample utterances generally each belong to an identified intent that may have been labeled with a particular entity, within a structure defined by the intent-entity models. To validate the relevance of each meaning representation for an identified intent, the artifact pinning subsystem may pin meaning representations that include the particular intent and include a respective entity corresponding to the labeled entity. In addition to model-based entity pinning, the search space may also be generated with respect to a contextual intent of an on-going conversation between a user and a behavior engine.

CROSS-REFERENCES

This application claims priority from and the benefit of U.S. Provisional Application No. 62/869,811, entitled “PINNING ARTIFACTS FOR EXPANSION OF SEARCH KEYS AND SEARCH SPACES IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK,” filed Jul. 2, 2019, which is incorporated by reference herein in its entirety for all purposes. This application is also related to U.S. Provisional Application No. 62/869,864, entitled “SYSTEM AND METHOD FOR PERFORMING A MEANING SEARCH USING A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK”; U.S. Provisional Application No. 62/869,817, entitled “PREDICTIVE SIMILARITY SCORING SUBSYSTEM IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK”; and U.S. Provisional Application No. 62/869,826, entitled “DERIVING MULTIPLE MEANING REPRESENTATIONS FOR AN UTTERANCE IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK,” which were each filed Jul. 2, 2019 and are incorporated by reference herein in their entirety for all purposes.

BACKGROUND

The present disclosure relates generally to the fields of natural language understanding (NLU) and artificial intelligence (AI), and more specifically, to an artifact pinning subsystem for NLU.

This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present disclosure, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.

Cloud computing relates to the sharing of computing resources that are generally accessed via the Internet. In particular, a cloud computing infrastructure allows users, such as individuals and/or enterprises, to access a shared pool of computing resources, such as servers, storage devices, networks, applications, and/or other computing based services. By doing so, users are able to access computing resources on demand that are located at remote locations and these resources may be used to perform a variety of computing functions (e.g., storing and/or processing large quantities of computing data). For enterprise and other organization users, cloud computing provides flexibility in accessing cloud computing resources without accruing large up-front costs, such as purchasing expensive network equipment or investing large amounts of time in establishing a private network infrastructure. Instead, by utilizing cloud computing resources, users are able to redirect their resources to focus on their enterprise's core functions.

Such a cloud computing service may host a virtual agent, such as a chat agent, that is designed to automatically respond to issues with the client instance based on natural language requests from a user of the client instance. For example, a user may provide a request to a virtual agent for assistance with a password issue, wherein the virtual agent is part of a Natural Language Processing (NLP) or Natural Language Understanding (NLU) system. NLP is a general area of computer science and AI that involves some form of processing of natural language input. Examples of areas addressed by NLP include language translation, speech generation, parse tree extraction, part-of-speech identification, and others. NLU is a sub-area of NLP that specifically focuses on understanding user utterances. Examples of areas addressed by NLU include question-answering (e.g., reading comprehension questions), article summarization, and others. For example, an NLU may use algorithms to reduce human language (e.g., spoken or written) into a set of known symbols for consumption by a downstream virtual agent. NLP is generally used to interpret free text for further analysis. Current approaches to NLP are typically based on deep learning, which is a type of AI that examines and uses patterns in data to improve the understanding of a program.

Certain existing virtual agents implementing NLU techniques attempt to derive meaning from a received user utterance by comparing features of the user utterance to a stored collection of sample utterances. Based on any matches therebetween, the virtual agents may understand a request of a received user utterance and perform suitable actions or provide suitable replies in response to the request. In such search-based implementations of NLU, it may be important to compare multiple interpretations of the user utterance to a sizeable quantity of sample utterances to provide a widely-scoped meaning search. However, undirected expansion of user utterances and/or sample utterances to achieve this wide search scope may introduce challenges with respect to processing and memory resources, inference latency, precision, and consistency during meaning derivation.

SUMMARY

A summary of certain embodiments disclosed herein is set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of these certain embodiments and that these aspects are not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be set forth below.

Present embodiments are directed to an agent automation framework that is designed to extract meaning from user utterances, such as requests received by a virtual agent, and suitably respond to these user utterances. To perform these tasks, the agent automation framework includes an NLU framework and an intent-entity model having defined intents and entities (e.g., artifacts) that are associated with sample utterances. The NLU framework includes a meaning extraction subsystem that is designed to generate meaning representations for the sample utterances of the intent-entity model to construct an understanding model, as well as generate meaning representations for a received user utterance to construct an utterance meaning model. As noted herein, each meaning representation embodies a different understanding or interpretation of an underlying utterance, whether the utterance is a sample utterance or a received user utterance. Additionally, the disclosed NLU framework includes a meaning search subsystem that is designed to search the meaning representations of the understanding model (which defines the meaning search space) to locate matches for meaning representations of the utterance meaning model (which defines the meaning search keys). As discussed herein, the meaning search subsystem performs expansion of both the search space and the search keys, as well as targeted pinning of the search space, to provide structure-specific search participants for improved meaning derivation. As such, present embodiments generally improve search-based NLU by providing focused expansion of the search keys and the search space against which the search keys are compared.

More specifically, present embodiments are directed to an artifact pinning subsystem that includes the meaning extraction subsystem and the meaning search subsystem mentioned above. By coordinating operation of the meaning extraction subsystem and the meaning search subsystem, the artifact pinning subsystem leverages relational cues provided during user interaction with the virtual agent (e.g., a behavior engine) and/or during compilation of the understanding models for improved meaning searching. For example, the artifact pinning subsystem may receive structural information from embedded relationships within the intent-entity model, contextual information from a behavior engine (BE) guiding end-user interaction, or both, to generate a tailored search space against which the NLU framework compares user-utterance-based search keys to more adeptly interact with and satisfy requests of the users interfacing with the NLU framework. To generate a particular search space, the artifact pinning subsystem disclosed herein may first generate multiple meaning representations as utterance tree structures, which provide potential representations of the various understandings derivable from each sample utterance of the intent-entity model. In some cases, the meaning representations are generated via vocabulary cleansing, vocabulary injection, and/or various part-of-speech assignments that are applied to respective tokens associated with nodes of the utterance trees. However, any suitable techniques that generate alternative meaning representations corresponding to various understandings of the sample utterances, including alternative parse structure discovery, vocabulary substitution, re-expressions, and so forth, may be implemented within the NLU framework. A significant number of potential candidates for the search space are thus generated by construing each of the sample utterances to have multiple different understandings, each corresponding to a respective meaning representation.

Notably, to guide pruning of the potential candidates, the artifact pinning subsystem identifies that each sample utterance of the intent-entity model was annotated with artifact labels that define relationships between the intents and the one or multiple entities of each sample utterance, within the structure defined by the particular intent-entity model. For example, an author of the intent-entity model may identify that a particular sample utterance relates to a particular intent, then label or annotate any suitable entities within the respective sample utterance that correspond to (e.g., belong to, are related to) the particular intent. The author may similarly label additional entities corresponding to additional intents of each given sample utterance. As recognized herein, the artifact pinning subsystem leverages the artifact labels of the sample utterances to prune any meaning representations that are not valid representations of a particular sample utterance, thereby improving a quality of a subsequent intent match by excluding non-relevant meaning representations for the intent match.

In particular, with respect to each intent of the intent-entity model, the artifact pinning subsystem may generate a set of meaning representations from a respective sample utterance that include the respective intent, as well as one or multiple respective entities that correspond to the labeled entity of a corresponding sample utterance. As discussed in more detail below, verifying that the entities of the generated meaning representations align with the artifact labels annotated in a respective intent-entity model enables the artifact pinning subsystem to efficiently prune invalid or non-relevant meaning representations from consideration for improved meaning search quality. The artifact pinning subsystem may then re-express the set of meaning representations by altering the arrangement or included number of nodes of utterance trees associated with the set, removing any duplicate candidates, and finally, generating the search space based on the remaining meaning representations of the set that have the labeled entity in a proper entity format. Similarly, the artifact pinning subsystem may form the search keys by generating multiple potential meaning representations of a user utterance, then, during a meaning search or inference, compare the search keys to the search space that includes the meaning representations that survive the above-mentioned model-based entity pinning. The meaning search may also be performed with respect to an inferenced, contextual intent of conversation between the user and a behavior engine, such that meaning representation matches are identified based on their correspondence to the contextual intent (e.g., an intent previously inferenced by the NLU during a dialog with a user) and thereby guide targeted pruning or refinement of the search space. Further, during the meaning search, the artifact pinning subsystem may increase the contribution (e.g., increase the respective similarity score) of meaning representations within the search space that match the contextual intent to improve similarity scoring processes based on a particular conversational situation or identified topic of conversation. In other embodiments, the artifact pinning subsystem may remove meaning representations that are not associated with the contextual intent, expediting the similarity scoring processes via implementation of a narrower embodiment of the search space.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects of this disclosure may be better understood upon reading the following detailed description and upon reference to the drawings in which:

FIG. 1 is a block diagram of an embodiment of a cloud computing system in which embodiments of the present technique may operate;

FIG. 2 is a block diagram of an embodiment of a multi-instance cloud architecture in which embodiments of the present technique may operate;

FIG. 3 is a block diagram of a computing device utilized in a computing system that may be present in FIG. 1 or 2, in accordance with aspects of the present technique;

FIG. 4A is a schematic diagram illustrating an embodiment of an agent automation framework including an NLU framework that is part of a client instance hosted by the cloud computing system, in accordance with aspects of the present technique;

FIG. 4B is a schematic diagram illustrating an alternative embodiment of the agent automation framework in which portions of the NLU framework are part of an enterprise instance hosted by the cloud computing system, in accordance with aspects of the present technique;

FIG. 5 is a flow diagram illustrating an embodiment of a process by which an agent automation framework, including an NLU framework and a Behavior Engine (BE), extracts intents and/or entities from and responds to a user utterance, in accordance with aspects of the present technique;

FIG. 6 is a block diagram illustrating an embodiment of the NLU framework including a meaning extraction subsystem and a meaning search subsystem, wherein the meaning extraction subsystem generates meaning representations from a received user utterance to yield an utterance meaning model and generates meaning representations from sample utterances of an understanding model to yield an understanding model, and wherein the meaning search subsystem compares meaning representations of the utterance meaning model to meaning representations of the understanding model to extract artifacts (e.g., intents and/or entities) from the received user utterance, in accordance with aspects of the present technique;

FIG. 7 is a diagram illustrating an example of an utterance tree generated for an utterance, in accordance with an embodiment of the present approach;

FIG. 8 is an information flow diagram illustrating an embodiment of an artifact pinning subsystem coordinating operation of the meaning extraction subsystem and the meaning search subsystem, wherein the meaning extraction subsystem generates a structure-pinned search space from multiple understanding models, and wherein the meaning search subsystem generates multiple meaning representations of a user utterance as a search key to extract the artifacts from the user utterance, in accordance with aspects of the present technique;

FIG. 9 is an information flow diagram illustrating an embodiment of the artifact pinning subsystem pinning entities within meaning representations utilized to generate the search space to identify suitable meaning representations, which include a particular intent associated with an entity that corresponds to a particular labeled entity of the intent-entity model, in accordance with aspects of the present technique;

FIG. 10 is a flow diagram illustrating an embodiment of a process whereby, during compilation of the search space, the artifact pinning subsystem expands and narrows a number of meaning representations generated for the sample utterances via model-based pinning of labeled entities within intents of the meaning representations; and

FIG. 11 is a flow diagram illustrating an embodiment of a process by which, during an intent match or inference, the artifact pinning subsystem expands a number of meaning representations generated for the received user utterance to form search keys and compares the search keys to a compiled search space that is narrowed via intent-pinning in view of a contextual intent derived from the BE.

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

One or more specific embodiments will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.

As used herein, the term “computing system” or “computing device” refers to an electronic computing device such as, but not limited to, a single computer, virtual machine, virtual container, host, server, laptop, and/or mobile device, or to a plurality of electronic computing devices working together to perform the function described as being performed on or by the computing system. As used herein, the term “machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store one or more instructions or data structures. The term “non-transitory machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the computing system and that cause the computing system to perform any one or more of the methodologies of the present subject matter, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such instructions. The term “non-transitory machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of non-transitory machine-readable media include, but are not limited to, non-volatile memory, including by way of example, semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices), magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks.

As used herein, the terms “application,” “engine,” and “plug-in” refer to one or more sets of computer software instructions (e.g., computer programs and/or scripts) executable by one or more processors of a computing system to provide particular functionality. Computer software instructions can be written in any suitable programming languages, such as C, C++, C#, Pascal, Fortran, Perl, MATLAB, SAS, SPSS, JavaScript, AJAX, and JAVA. Such computer software instructions can comprise an independent application with data input and data display modules. Alternatively, the disclosed computer software instructions can be classes that are instantiated as distributed objects. The disclosed computer software instructions can also be component software, for example JAVABEANS or ENTERPRISE JAVABEANS. Additionally, the disclosed applications or engines can be implemented in computer software, computer hardware, or a combination thereof.

As used herein, the term “framework” refers to a system of applications and/or engines, as well as any other supporting data structures, libraries, modules, and any other supporting functionality, that cooperate to perform one or more overall functions. In particular, a “natural language understanding framework” or “NLU framework” comprises a collection of computer programs designed to process and derive meaning (e.g., intents, entities, artifacts) from natural language utterances based on an understanding model. As used herein, a “behavior engine” or “BE,” also known as a reasoning agent or RA/BE, refers to a rule-based agent, such as a virtual agent, designed to interact with users based on a conversation model. For example, a “virtual agent” may refer to a particular example of a BE that is designed to interact with users via natural language requests in a particular conversational or communication channel. With this in mind, the terms “virtual agent” and “BE” are used interchangeably herein. By way of specific example, a virtual agent may be or include a chat agent that interacts with users via natural language requests and responses in a chat room environment. Other examples of virtual agents may include an email agent, a forum agent, a ticketing agent, a telephone call agent, and so forth, which interact with users in the format of email, forum posts, autoreplies to service tickets, phone calls, and so forth.

As used herein, an “intent” refers to a desire or goal of a user which may relate to an underlying purpose of a communication, such as an utterance. As used herein, an “entity” refers to an object, subject, or some other parameterization of an intent. It is noted that, for present embodiments, certain entities are treated as parameters of a corresponding intent. More specifically, certain entities (e.g., time and location) may be globally recognized and extracted for all intents, while other entities are intent-specific (e.g., merchandise entities associated with purchase intents) and are generally extracted only when found within the intents that define them. As used herein, “artifact” collectively refers to both intents and entities of an utterance. As used herein, an “understanding model” is a collection of models used by the NLU framework to infer meaning of natural language utterances. An understanding model may include a vocabulary model that associates certain tokens (e.g., words or phrases) with particular word vectors, an intent-entity model, an entity model, or a combination thereof. As used herein, an “intent-entity model” refers to a model that associates particular intents with particular sample utterances, wherein entities associated with the intent may be encoded as a parameter of the intent within the sample utterances of the model. As used herein, the term “agents” may refer to computer-generated personas (e.g., chat agents or other virtual agents) that interact with users within a conversational channel. As used herein, a “corpus” refers to a captured body of source data that includes interactions between various users and virtual agents, wherein the interactions include communications or conversations within one or more suitable types of media (e.g., a help line, a chat room or message string, an email string). As used herein, an “utterance tree” refers to a data structure that stores a meaning representation of an utterance. As discussed, an utterance tree has a tree structure (e.g., a dependency parse tree structure) that represents the syntactic and grammatical structure of the utterance (e.g., relationships between words, part-of-speech (POS) taggings), wherein nodes of the tree structure store vectors (e.g., word vectors, subtree vectors) that encode the semantic meaning of the utterance. As used herein, a “quality” of an inference or meaning search refers to a quantitative measure based on one or multiple of an accuracy, a precision, and/or any suitable F-score of the inference, as would be understood by one of ordinary skill in the art of NLU.
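By way of non-limiting illustration, the utterance tree defined above may be sketched as a small data structure. The following is a minimal sketch only: the class name `UtteranceNode`, its fields, the POS tag strings, and the three-dimensional toy vectors are assumptions made for illustration and are not taken from the present disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UtteranceNode:
    """One node of a hypothetical utterance tree (illustrative names only)."""
    token: str            # surface word or phrase held by the node
    pos: str              # part-of-speech tag assigned to the token
    vector: List[float]   # word or subtree vector (toy 3-dimensional values)
    children: List["UtteranceNode"] = field(default_factory=list)

    def add(self, child: "UtteranceNode") -> "UtteranceNode":
        """Attach a dependent node and return it for chaining."""
        self.children.append(child)
        return child

    def tokens(self) -> List[str]:
        """Return tokens in pre-order (head first, then dependents)."""
        out = [self.token]
        for child in self.children:
            out.extend(child.tokens())
        return out

    def structure(self) -> tuple:
        """Return a hashable summary of the form: (POS, child structures)."""
        return (self.pos, tuple(c.structure() for c in self.children))


# A toy dependency-style tree for the utterance "reset my password".
root = UtteranceNode("reset", "VERB", [0.1, 0.7, 0.2])
obj = root.add(UtteranceNode("password", "NOUN", [0.9, 0.1, 0.3]))
obj.add(UtteranceNode("my", "PRON", [0.2, 0.2, 0.5]))
```

In this sketch, the nesting of nodes carries the form of the utterance while the per-node vectors carry its semantic content, so two understandings of the same utterance would differ in `structure()`, in the assigned `pos` values, or in both.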

As used herein, “source data” or “conversation logs” may include any suitable captured interactions between various agents and users, including but not limited to, chat logs, email strings, documents, help documentation, frequently asked questions (FAQs), forum entries, items in support ticketing, recordings of help line calls, and so forth. As used herein, an “utterance” refers to a single natural language statement made by a user or agent that may include one or more intents. As such, an utterance may be part of a previously captured corpus of source data, and an utterance may also be a new statement received from a user as part of an interaction with a virtual agent. As used herein, “machine learning” or “ML” may be used to refer to any suitable statistical form of artificial intelligence capable of being trained using machine learning techniques, including supervised, unsupervised, and semi-supervised learning techniques. For example, in certain embodiments, ML techniques may be implemented using a neural network (NN) (e.g., a deep neural network (DNN), a recurrent neural network (RNN), a recursive neural network). As used herein, a “vector” (e.g., a word vector, an intent vector, a subject vector, a subtree vector) refers to a linear algebra vector that is an ordered n-dimensional list (e.g., a 300-dimensional list) of floating point values (e.g., a 1×N or an N×1 matrix) that provides a mathematical representation of the semantic meaning of a portion (e.g., a word or phrase, an intent, an entity, a token) of an utterance.

As used herein, the terms “dialog” and “conversation” refer to an exchange of utterances between a user and a virtual agent over a period of time (e.g., a day, a week, a month, a year, etc.). As used herein, an “episode” refers to distinct portions of dialog that may be delineated from one another based on a change in topic, a substantial delay between communications, or other factors. As used herein, “context” refers to information associated with an episode of a conversation that can be used by the BE to determine suitable actions in response to extracted intents and/or entities of a user utterance. As used herein, a “contextual intent” refers to an intent that was previously identified or processed by the NLU framework during a flow or conversation between the BE and the user. As used herein, “domain specificity” refers to how attuned a system is to correctly extracting intents and entities expressed in actual conversations in a given domain and/or conversational channel. As used herein, an “understanding” of an utterance refers to an interpretation or a construction of the utterance by the NLU framework. As such, it may be appreciated that different understandings of an utterance are generally associated with different meaning representations having different structures (e.g., different nodes, different relationships between nodes), different POS taggings, and so forth.

As mentioned, a computing platform may include a chat agent, or another similar virtual agent, that is designed to automatically respond to user requests to perform functions or address issues on the platform via NLU techniques. The disclosed NLU framework is based on principles of cognitive construction grammar (CCG), in which an aspect of the meaning of a natural language utterance can be determined based on the form (e.g., syntactic structure, shape) and semantic meaning of the utterance. The disclosed NLU framework is capable of generating multiple meaning representations that form one or more search keys for an utterance. Additionally, the disclosed NLU framework is capable of generating an understanding model having multiple meaning representations for certain sample utterances, which expands the search space for meaning search, thereby improving operation of the NLU framework. However, when attempting to derive user intent from natural language utterances by comparing the search keys to the search space, it is presently recognized that certain NLU frameworks may perform inefficient searches when considering search keys and search spaces that include a large number of meaning representations, potentially returning irrelevant intent matches and/or incurring undesirable inference latency that reduces user satisfaction with such NLU frameworks.

Accordingly, present embodiments are generally directed toward an agent automation framework capable of leveraging CCG techniques to generate multiple meaning representations for utterances, including sample utterances in the intent-entity model and utterances received from a user. In particular, an artifact pinning subsystem of the agent automation system directs a meaning extraction subsystem and a meaning search subsystem of the agent automation system during both compilation of a search space, as well as during inference of a received user utterance that is transformed into one or more search keys and compared to the search space. During generation of the search space, the artifact pinning subsystem may determine multiple different understandings of sample utterances within one or multiple intent-entity models by performing vocabulary adjustment, varied part-of-speech assignment, and/or any other suitable processes that generate multiple meaning representations corresponding to various understandings of each sample utterance. As such, the artifact pinning subsystem generates a potentially sizeable quantity of candidates for inclusion within the search space.
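As one hedged illustration of how varied part-of-speech assignment alone can multiply the candidates for a single sample utterance, the sketch below enumerates every combination of candidate tags. The lookup table `POS_CANDIDATES`, the function `expand_understandings`, and the fallback tag `"X"` are hypothetical names chosen for this example; an actual implementation would derive candidate tags from a parser or vocabulary model rather than a fixed table.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical lookup of plausible part-of-speech tags per token.
POS_CANDIDATES: Dict[str, List[str]] = {
    "book": ["VERB", "NOUN"],
    "a": ["DET"],
    "flight": ["NOUN"],
}

TaggedUtterance = Tuple[Tuple[str, str], ...]


def expand_understandings(tokens: List[str]) -> List[TaggedUtterance]:
    """Return one tagged sequence per combination of candidate POS tags."""
    # Tokens missing from the table fall back to a single placeholder tag.
    options = [POS_CANDIDATES.get(tok, ["X"]) for tok in tokens]
    return [tuple(zip(tokens, tags)) for tags in product(*options)]


# "book" may be read as a verb or a noun, so two understandings result.
candidates = expand_understandings(["book", "a", "flight"])
```

Because the number of combinations is the product of the per-token tag counts, the candidate set can grow quickly with utterance length, which is why the selective pruning described next is useful.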

To selectively prune the candidates, the artifact pinning subsystem disclosed herein leverages artifact correlations of the sample utterances to prune any meaning representations that are not valid representations of a particular sample utterance. That is, the sample utterances generally each belong to an identified intent that may have been labeled (e.g., by an author or ML-based annotation subsystem) with any suitable number of corresponding entities, within a structure or set of relationships defined by the intent-entity model. To validate the relevance of each candidate meaning representation for an identified intent, the artifact pinning subsystem may identify and pin a set of meaning representations that include the particular intent and include one or multiple respective entities corresponding to one or multiple labeled entities of a corresponding sample utterance. That is, the meaning representations of the set are desirably retained (e.g., pinned) as valid formulations of an associated sample utterance, within the structure defined by artifact labels of a respective intent-entity model. The set of meaning representations may then be re-expressed or expanded via any suitable processes, such as altering the arrangement or included number of nodes of meaning representations (e.g., utterance trees) associated with the set. The artifact pinning subsystem of certain embodiments may therefore remove any duplicate candidates and generate (e.g., compile) the search space based on the remaining meaning representations with the appropriate pinned entity.
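The model-based entity pinning described above may be sketched, under stated assumptions, as a filter followed by duplicate removal. In this sketch the class `MeaningRepresentation`, the function `pin_entities`, the bracketed `form` strings standing in for utterance trees, and the example intents and entities are all hypothetical; the re-expression step is omitted for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

Entity = Tuple[str, str]  # (entity type, matched text); illustrative format


@dataclass(frozen=True)
class MeaningRepresentation:
    """Hypothetical flattened candidate derived from one sample utterance."""
    form: str                    # stand-in for the utterance tree
    intent: str                  # intent the candidate expresses
    entities: FrozenSet[Entity]  # entities found in the candidate


def pin_entities(
    candidates: Iterable[MeaningRepresentation],
    intent: str,
    labeled_entities: Iterable[Entity],
) -> List[MeaningRepresentation]:
    """Keep candidates with the intent and every labeled entity, minus duplicates."""
    required = frozenset(labeled_entities)
    pinned: List[MeaningRepresentation] = []
    seen = set()
    for mr in candidates:
        if mr.intent != intent or not required <= mr.entities:
            continue  # pruned: wrong intent or a labeled entity is missing
        if mr in seen:
            continue  # duplicate candidate removed
        seen.add(mr)
        pinned.append(mr)
    return pinned


# Sample utterance "order large coffee", labeled with item = "large coffee".
sample_candidates = [
    MeaningRepresentation("order [large coffee]", "purchase",
                          frozenset({("item", "large coffee")})),
    MeaningRepresentation("order large [coffee]", "purchase",
                          frozenset({("item", "coffee")})),
    MeaningRepresentation("order [large coffee]", "purchase",
                          frozenset({("item", "large coffee")})),
    MeaningRepresentation("[large coffee] order", "status",
                          frozenset({("item", "large coffee")})),
]
search_space = pin_entities(sample_candidates, "purchase",
                            [("item", "large coffee")])
```

Of the four candidates, only the first survives: the second identifies an entity that does not correspond to the labeled entity, the third is a duplicate, and the fourth carries a different intent.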

During a meaning search performed to derive meaning from an on-going conversation between a user and a behavior engine, the artifact pinning subsystem may form the one or more search keys to compare against the search space by generating multiple potential meaning representations of a user utterance. Notably, the artifact pinning subsystem may pin the search space with respect to an inferenced, contextual intent of the on-going conversation, guiding further targeted pruning of the search space that may otherwise decrease a quality or increase a resource demand of a meaning search. For example, the artifact pinning subsystem may identify relevant meaning representations within the search space or the underlying understanding model to provide a similarity scoring bonus to the relevant meaning representations or to prune other, non-relevant meaning representations from the search space, thereby providing more direct search paths for the meaning searches. As will be understood, the herein-disclosed pinning of the multiple various candidates of the search space enables the agent automation system to target particularly-relevant candidates for improved meaning search quality. Pruning the search space to these candidates may also limit computing resource usage and improve the efficiency of the NLU framework.
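The two contextual-intent options described above, a similarity scoring bonus or outright pruning, may be sketched as follows. This is an illustrative sketch only: the function `apply_contextual_intent`, the tuple layout, the additive form of the bonus, and its default value are assumptions, since the present disclosure does not specify how the bonus is computed.

```python
from typing import List, Tuple

# (candidate id, intent, base similarity score); illustrative layout only.
Scored = Tuple[str, str, float]


def apply_contextual_intent(
    scored: List[Scored],
    contextual_intent: str,
    bonus: float = 0.25,   # assumed additive bonus; not from the disclosure
    prune: bool = False,
) -> List[Scored]:
    """Boost or retain only candidates matching the contextual intent."""
    if prune:
        # Narrow the search space to candidates of the contextual intent.
        kept = [s for s in scored if s[1] == contextual_intent]
    else:
        # Keep every candidate but raise the score of matching ones.
        kept = [
            (cid, intent, score + bonus if intent == contextual_intent else score)
            for cid, intent, score in scored
        ]
    # Highest similarity first.
    return sorted(kept, key=lambda s: s[2], reverse=True)


scored_candidates = [
    ("mr1", "reset_password", 0.5),
    ("mr2", "order_laptop", 0.625),
    ("mr3", "reset_password", 0.25),
]
boosted = apply_contextual_intent(scored_candidates, "reset_password")
pruned = apply_contextual_intent(scored_candidates, "reset_password", prune=True)
```

With the bonus, `mr1` overtakes `mr2` because the conversation is already about password resets; with pruning, `mr2` is never scored at all, which trades some recall for a smaller search space.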

With the preceding in mind, the following figures relate to various types of generalized system architectures or configurations that may be employed to provide services to an organization in a multi-instance framework and on which the present approaches may be employed. Correspondingly, these system and platform examples may also relate to systems and platforms on which the techniques discussed herein may be implemented or otherwise utilized. Turning now to FIG. 1, a schematic diagram of an embodiment of a cloud computing system 10, in which embodiments of the present disclosure may operate, is illustrated. The cloud computing system 10 may include a client network 12, a network 18 (e.g., the Internet), and a cloud-based platform 20. In some implementations, the cloud-based platform 20 may be a configuration management database (CMDB) platform. In one embodiment, the client network 12 may be a local private network, such as a local area network (LAN) having a variety of network devices that include, but are not limited to, switches, servers, and routers. In another embodiment, the client network 12 represents an enterprise network that could include one or more LANs, virtual networks, data centers 22, and/or other remote networks. As shown in FIG. 1, the client network 12 is able to connect to one or more client devices 14A, 14B, and 14C so that the client devices are able to communicate with each other and/or with the network hosting the platform 20. The client devices 14 may be computing systems and/or other types of computing devices generally referred to as Internet of Things (IoT) devices that access cloud computing services, for example, via a web browser application or via an edge device 16 that may act as a gateway between the client devices 14 and the platform 20. FIG. 1 also illustrates that the client network 12 includes an administration or managerial device, agent, or server, such as a management, instrumentation, and discovery (MID) server 17 that facilitates communication of data between the network hosting the platform 20, other external applications, data sources, and services, and the client network 12. Although not specifically illustrated in FIG. 1, the client network 12 may also include a connecting network device (e.g., a gateway or router) or a combination of devices that implement a customer firewall or intrusion protection system.

For the illustrated embodiment, FIG. 1 illustrates that client network 12 is coupled to a network 18. The network 18 may include one or more computing networks, such as other LANs, wide area networks (WANs), the Internet, and/or other remote networks, to transfer data between the client devices 14A-C and the network hosting the platform 20. Each of the computing networks within network 18 may contain wired and/or wireless programmable devices that operate in the electrical and/or optical domain. For example, network 18 may include wireless networks, such as cellular networks (e.g., Global System for Mobile Communications (GSM) based cellular network), IEEE 802.11 networks, and/or other suitable radio-based networks. The network 18 may also employ any number of network communication protocols, such as Transmission Control Protocol (TCP) and Internet Protocol (IP). Although not explicitly shown in FIG. 1, network 18 may include a variety of network devices, such as servers, routers, network switches, and/or other network hardware devices configured to transport data over the network 18.

In FIG. 1, the network hosting the platform 20 may be a remote network (e.g., a cloud network) that is able to communicate with the client devices 14 via the client network 12 and network 18. The network hosting the platform 20 provides additional computing resources to the client devices 14 and/or the client network 12. For example, by utilizing the network hosting the platform 20, users of the client devices 14 are able to build and execute applications for various enterprise, IT, and/or other organization-related functions. In one embodiment, the network hosting the platform 20 is implemented on the one or more data centers 22, where each data center could correspond to a different geographic location. Each of the data centers 22 includes a plurality of virtual servers 24 (also referred to herein as application nodes, application servers, virtual server instances, application instances, or application server instances), where each virtual server 24 can be implemented on a physical computing system, such as a single electronic computing device (e.g., a single physical hardware server) or across multiple computing devices (e.g., multiple physical hardware servers). Examples of virtual servers 24 include, but are not limited to, a web server (e.g., a unitary Apache installation), an application server (e.g., a unitary Java Virtual Machine), and/or a database server (e.g., a unitary relational database management system (RDBMS) catalog).

To utilize computing resources within the platform 20, network operators may choose to configure the data centers 22 using a variety of computing infrastructures. In one embodiment, one or more of the data centers 22 are configured using a multi-tenant cloud architecture, such that one of the server instances 24 handles requests from and serves multiple customers. Data centers 22 with multi-tenant cloud architecture commingle and store data from multiple customers, where multiple customer instances are assigned to one of the virtual servers 24. In a multi-tenant cloud architecture, the particular virtual server 24 distinguishes between and segregates data and other information of the various customers. For example, a multi-tenant cloud architecture could assign a particular identifier for each customer in order to identify and segregate the data from each customer. Generally, implementing a multi-tenant cloud architecture may suffer from various drawbacks, such as a failure of a particular one of the server instances 24 causing outages for all customers allocated to the particular server instance.

In another embodiment, one or more of the data centers 22 are configured using a multi-instance cloud architecture to provide every customer its own unique customer instance or instances. For example, a multi-instance cloud architecture could provide each customer instance with its own dedicated application server and dedicated database server. In other examples, the multi-instance cloud architecture could deploy a single physical or virtual server 24 and/or other combinations of physical and/or virtual servers 24, such as one or more dedicated web servers, one or more dedicated application servers, and one or more database servers, for each customer instance. In a multi-instance cloud architecture, multiple customer instances could be installed on one or more respective hardware servers, where each customer instance is allocated certain portions of the physical server resources, such as computing memory, storage, and processing power. By doing so, each customer instance has its own unique software stack that provides the benefit of data isolation, relatively less downtime for customers to access the platform 20, and customer-driven upgrade schedules. An example of implementing a customer instance within a multi-instance cloud architecture will be discussed in more detail below with reference to FIG. 2.

FIG. 2 is a schematic diagram of an embodiment of a multi-instance cloud architecture 40 where embodiments of the present disclosure may operate. FIG. 2 illustrates that the multi-instance cloud architecture 40 includes the client network 12 and the network 18 that connect to two (e.g., paired) data centers 22A and 22B that may be geographically separated from one another. Using FIG. 2 as an example, the network environment and service provider cloud infrastructure client instance 42 (also referred to herein as a client instance 42) is associated with (e.g., supported and enabled by) dedicated virtual servers (e.g., virtual servers 24A, 24B, 24C, and 24D) and dedicated database servers (e.g., virtual database servers 44A and 44B). Stated another way, the virtual servers 24A-24D and virtual database servers 44A and 44B are not shared with other client instances and are specific to the respective client instance 42. In the depicted example, to facilitate availability of the client instance 42, the virtual servers 24A-24D and virtual database servers 44A and 44B are allocated to two different data centers 22A and 22B so that one of the data centers 22 acts as a backup data center. Other embodiments of the multi-instance cloud architecture 40 could include other types of dedicated virtual servers, such as a web server. For example, the client instance 42 could be associated with (e.g., supported and enabled by) the dedicated virtual servers 24A-24D, dedicated virtual database servers 44A and 44B, and additional dedicated virtual web servers (not shown in FIG. 2).

Although FIGS. 1 and 2 illustrate specific embodiments of a cloud computing system 10 and a multi-instance cloud architecture 40, respectively, the disclosure is not limited to the specific embodiments illustrated in FIGS. 1 and 2. For instance, although FIG. 1 illustrates that the platform 20 is implemented using data centers, other embodiments of the platform 20 are not limited to data centers and can utilize other types of remote network infrastructures. Moreover, other embodiments of the present disclosure may combine one or more different virtual servers into a single virtual server or, conversely, perform operations attributed to a single virtual server using multiple virtual servers. For instance, using FIG. 2 as an example, the virtual servers 24A, 24B, 24C, 24D and virtual database servers 44A, 44B may be combined into a single virtual server. Moreover, the present approaches may be implemented in other architectures or configurations, including, but not limited to, multi-tenant architectures, generalized client/server implementations, and/or even on a single physical processor-based device configured to perform some or all of the operations discussed herein. Similarly, though virtual servers or machines may be referenced to facilitate discussion of an implementation, physical servers may instead be employed as appropriate. The use and discussion of FIGS. 1 and 2 are only examples to facilitate ease of description and explanation and are not intended to limit the disclosure to the specific examples illustrated therein.

As may be appreciated, the respective architectures and frameworks discussed with respect to FIGS. 1 and 2 incorporate computing systems of various types (e.g., servers, workstations, client devices, laptops, tablet computers, cellular telephones, and so forth) throughout. For the sake of completeness, a brief, high-level overview of components typically found in such systems is provided. As may be appreciated, the present overview is intended to merely provide a high-level, generalized view of components typical in such computing systems and should not be viewed as limiting in terms of components discussed or omitted from discussion.

By way of background, it may be appreciated that the present approach may be implemented using one or more processor-based systems such as shown in FIG. 3. Likewise, applications and/or databases utilized in the present approach may be stored, employed, and/or maintained on such processor-based systems. As may be appreciated, such systems as shown in FIG. 3 may be present in a distributed computing environment, a networked environment, or other multi-computer platform or architecture. Likewise, systems such as that shown in FIG. 3 may be used in supporting or communicating with one or more virtual environments or computational instances on which the present approach may be implemented.

With this in mind, an example computer system may include some or all of the computer components depicted in FIG. 3. FIG. 3 generally illustrates a block diagram of example components of a computing system 80 and their potential interconnections or communication paths, such as along one or more busses. As illustrated, the computing system 80 may include various hardware components such as, but not limited to, one or more processors 82, one or more busses 84, memory 86, input devices 88, a power source 90, a network interface 92, a user interface 94, and/or other computer components useful in performing the functions described herein.

The one or more processors 82 may include one or more microprocessors capable of performing instructions stored in the memory 86. Additionally or alternatively, the one or more processors 82 may include application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), and/or other devices designed to perform some or all of the functions discussed herein without calling instructions from the memory 86.

With respect to other components, the one or more busses 84 include suitable electrical channels to provide data and/or power between the various components of the computing system 80. The memory 86 may include any tangible, non-transitory, and computer-readable storage media. Although shown as a single block in FIG. 3, the memory 86 can be implemented using multiple physical units of the same or different types in one or more physical locations. The input devices 88 correspond to structures to input data and/or commands to the one or more processors 82. For example, the input devices 88 may include a mouse, touchpad, touchscreen, keyboard, and the like. The power source 90 can be any suitable source for power of the various components of the computing system 80, such as line power and/or a battery source. The network interface 92 includes one or more transceivers capable of communicating with other devices over one or more networks (e.g., a communication channel). The network interface 92 may provide a wired network interface or a wireless network interface. A user interface 94 may include a display that is configured to display text or images transferred to it from the one or more processors 82. In addition and/or alternative to the display, the user interface 94 may include other devices for interfacing with a user, such as lights (e.g., LEDs), speakers, and the like.

It should be appreciated that the cloud-based platform 20 discussed above provides an example of an architecture that may utilize NLU technologies. In particular, the cloud-based platform 20 may include or store a large corpus of source data that can be mined to facilitate the generation of a number of outputs, including an intent-entity model. For example, the cloud-based platform 20 may include ticketing source data having requests for changes or repairs to particular systems, dialog between the requester and a service technician or an administrator attempting to address an issue, a description of how the ticket was eventually resolved, and so forth. Then, the generated intent-entity model can serve as a basis for classifying intents in future requests, and can be used to generate and improve a conversational model to support a virtual agent that can automatically address future issues within the cloud-based platform 20 based on natural language requests from users. As such, in certain embodiments described herein, the disclosed agent automation framework is incorporated into the cloud-based platform 20, while in other embodiments, the agent automation framework may be hosted and executed (separately from the cloud-based platform 20) by a suitable system that is communicatively coupled to the cloud-based platform 20 to process utterances, as discussed below.

With the foregoing in mind, FIG. 4A illustrates an agent automation framework 100 (also referred to herein as an agent automation system 100) associated with a client instance 42. More specifically, FIG. 4A illustrates an example of a portion of a service provider cloud infrastructure, including the cloud-based platform 20 discussed above. The cloud-based platform 20 is connected to a client device 14D via the network 18 to provide a user interface to network applications executing within the client instance 42 (e.g., via a web browser of the client device 14D). Client instance 42 is supported by virtual servers similar to those explained with respect to FIG. 2, and is illustrated here to show support for the disclosed functionality described herein within the client instance 42. The cloud provider infrastructure is generally configured to support a plurality of end-user devices, such as client device 14D, concurrently, wherein each end-user device is in communication with the single client instance 42. Also, the cloud provider infrastructure may be configured to support any number of client instances, such as client instance 42, concurrently, with each of the instances in communication with one or more end-user devices. As mentioned above, an end-user may also interface with client instance 42 using an application that is executed within a web browser.

The embodiment of the agent automation framework 100 illustrated in FIG. 4A includes a behavior engine (BE) 102, an NLU framework 104, and a database 106, which are communicatively coupled within the client instance 42. The BE 102 may host or include any suitable number of virtual agents or personas that interact with the user of the client device 14D via natural language user requests 122 (also referred to herein as user utterances 122 or utterances 122) and agent responses 124 (also referred to herein as agent utterances 124). It may be noted that, in actual implementations, the agent automation framework 100 may include a number of other suitable components, including the meaning extraction subsystem, the meaning search subsystem, and so forth, in accordance with the present disclosure.

For the embodiment illustrated in FIG. 4A, the database 106 may be a database server instance (e.g., database server instance 44A or 44B, as discussed with respect to FIG. 2), or a collection of database server instances. The illustrated database 106 stores an intent-entity model 108, a conversation model 110, a corpus of utterances 112, and a collection of rules 114 in one or more tables (e.g., relational database tables) of the database 106. The intent-entity model 108 stores associations or relationships between particular intents and particular entities via particular sample utterances. In certain embodiments, the intent-entity model 108 may be authored by a designer using a suitable authoring tool. In other embodiments, the agent automation framework 100 generates the intent-entity model 108 from the corpus of utterances 112 and the collection of rules 114 stored in one or more tables of the database 106. The intent-entity model 108 may also be determined based on a combination of authored and ML techniques, in some embodiments. In any case, it should be understood that the disclosed intent-entity model 108 may associate any suitable combination of intents and/or entities with respective ones of the corpus of utterances 112. For embodiments discussed below, sample utterances of the intent-entity model 108 are used to generate meaning representations of an understanding model to define the search space for a meaning search.
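
One way to picture the associations stored by the intent-entity model 108 is as a collection of sample utterances, each labeled with an intent and zero or more entities. The sketch below is a minimal in-memory illustration; the class names, field names, and example intents are hypothetical and do not reflect the database tables of any actual embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class SampleUtterance:
    text: str
    intent: str
    entities: dict = field(default_factory=dict)  # entity label -> labeled text span


@dataclass
class IntentEntityModel:
    samples: list = field(default_factory=list)

    def add(self, text, intent, **entities):
        """Associate a sample utterance with an intent and any labeled entities."""
        self.samples.append(SampleUtterance(text, intent, dict(entities)))

    def samples_for(self, intent):
        """Return every sample utterance labeled with the given intent."""
        return [s for s in self.samples if s.intent == intent]


model = IntentEntityModel()
model.add("I want to buy a blue shirt", "purchase", item="blue shirt")
model.add("I need to return these batteries", "return_item", item="batteries")
```

Each sample utterance in such a structure would then be processed into one or more meaning representations to populate the search space.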

For the embodiment illustrated in FIG. 4A, the conversation model 110 stores associations between intents of the intent-entity model 108 and particular responses and/or actions, which generally define the behavior of the BE 102. In certain embodiments, at least a portion of the associations within the conversation model are manually created or predefined by a designer of the BE 102 based on how the designer wants the BE 102 to respond to particular identified artifacts in processed utterances. It should be noted that, in different embodiments, the database 106 may include other database tables storing other information related to intent classification, such as tables storing information regarding compilation model template data (e.g., class compatibility rules, class-level scoring coefficients, tree-model comparison algorithms, tree substructure vectorization algorithms), meaning representations, and so forth.

For the illustrated embodiment, the NLU framework 104 includes an NLU engine 116 and a vocabulary manager 118. It may be appreciated that the NLU framework 104 may include any suitable number of other components. In certain embodiments, the NLU engine 116 is designed to perform a number of functions of the NLU framework 104, including generating word vectors (e.g., intent vectors, subject or entity vectors, subtree vectors) from words or phrases of utterances, as well as determining distances (e.g., Euclidean distances) between these vectors. For example, the NLU engine 116 is generally capable of producing a respective intent vector for each intent of an analyzed utterance. As such, a similarity measure or distance between two different utterances can be calculated using the respective intent vectors produced by the NLU engine 116 for the two intents, wherein the similarity measure provides an indication of similarity in meaning between the two intents.
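
The distance computation mentioned above can be illustrated with a short Python sketch. The three-dimensional vectors are made-up values chosen for readability, and the mapping from distance to a similarity score is one illustrative choice among many, not a formula taken from the disclosure.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similarity(u, v):
    """Map a distance onto (0, 1]; identical vectors score 1.0 (illustrative choice)."""
    return 1.0 / (1.0 + euclidean_distance(u, v))


# Made-up intent vectors for three hypothetical intents.
buy = [0.9, 0.1, 0.0]
purchase = [0.8, 0.2, 0.0]
travel = [0.0, 0.1, 0.9]
```

Under these toy values, `similarity(buy, purchase)` exceeds `similarity(buy, travel)`, reflecting that the first pair of intents is closer in meaning.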

The vocabulary manager 118 addresses out-of-vocabulary words and symbols that were not encountered by the NLU framework 104 during vocabulary training. For example, in certain embodiments, the vocabulary manager 118 can identify and replace synonyms and domain-specific meanings of words and acronyms within utterances analyzed by the agent automation framework 100 (e.g., based on the collection of rules 114), which can improve the performance of the NLU framework 104 to properly identify intents and entities within context-specific utterances. Additionally, to accommodate the tendency of natural language to adopt new usages for pre-existing words, in certain embodiments, the vocabulary manager 118 handles repurposing of words previously associated with other intents or entities based on a change in context. For example, the vocabulary manager 118 could handle a situation in which, in the context of utterances from a particular client instance and/or conversation channel, the word “bike” actually refers to a motorcycle rather than a bicycle.
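
A rule-based substitution of the kind described above might look like the following sketch. The `VocabularyManager` class, its rule dictionaries, and the `motorsports_channel` context name are hypothetical; an actual embodiment would draw its rules from the collection of rules 114 and may use far richer matching than whole-word replacement.

```python
import re


class VocabularyManager:
    """Toy vocabulary manager: whole-word replacement with per-context overrides."""

    def __init__(self, default_rules, context_rules=None):
        self.default_rules = default_rules          # word -> replacement
        self.context_rules = context_rules or {}    # context -> {word: replacement}

    def normalize(self, utterance, context=None):
        # Context-specific rules take precedence over the default rules.
        rules = dict(self.default_rules)
        rules.update(self.context_rules.get(context, {}))

        def swap(match):
            word = match.group(0)
            return rules.get(word.lower(), word)

        return re.sub(r"[A-Za-z']+", swap, utterance)


vm = VocabularyManager(
    default_rules={"pw": "password"},
    context_rules={"motorsports_channel": {"bike": "motorcycle"}},
)
```

With these rules, "bike" is rewritten to "motorcycle" only when the utterance arrives through the hypothetical `motorsports_channel` context, while "pw" is expanded to "password" in every context.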

Once the intent-entity model 108 and the conversation model 110 have been created, the agent automation framework 100 is designed to receive a user utterance 122 (in the form of a natural language request) and to appropriately take action to address the request. For example, for the embodiment illustrated in FIG. 4A, the BE 102 is a virtual agent that receives, via the network 18, the utterance 122 (e.g., a natural language request in a chat communication) submitted by the client device 14D disposed on the client network 12. The BE 102 provides the utterance 122 to the NLU framework 104, and the NLU engine 116, along with the various subsystems of the NLU framework discussed below, processes the utterance 122 based on the intent-entity model 108 to derive artifacts (e.g., intents and/or entities) within the utterance. Based on the artifacts derived by the NLU engine 116, as well as the associations within the conversation model 110, the BE 102 performs one or more particular predefined actions. For the illustrated embodiment, the BE 102 also provides a response 124 (e.g., a virtual agent utterance 124 or confirmation) to the client device 14D via the network 18, for example, indicating actions performed by the BE 102 in response to the received user utterance 122. Additionally, in certain embodiments, the utterance 122 may be added to the utterances 112 stored in the database 106 for continued learning within the NLU framework 104.

It may be appreciated that, in other embodiments, one or more components of the agent automation framework 100 and/or the NLU framework 104 may be otherwise arranged, situated, or hosted for improved performance. For example, in certain embodiments, one or more portions of the NLU framework 104 may be hosted by an instance (e.g., a shared instance, an enterprise instance) that is separate from, and communicatively coupled to, the client instance 42. It is presently recognized that such embodiments can advantageously reduce the size of the client instance 42, improving the efficiency of the cloud-based platform 20. In particular, in certain embodiments, one or more components of the artifact pinning subsystem discussed below may be hosted by a separate instance (e.g., an enterprise instance) that is communicatively coupled to the client instance 42, as well as other client instances, to enable improved meaning searching for suitable matching meaning representations within the search space to enable identification of artifact matches for the utterance 122.

With the foregoing in mind, FIG. 4B illustrates an alternative embodiment of the agent automation framework 100 in which portions of the NLU framework 104 are instead executed by a separate, shared instance (e.g., enterprise instance 125) that is hosted by the cloud-based platform 20. The illustrated enterprise instance 125 is communicatively coupled to exchange data related to artifact mining and classification with any suitable number of client instances via a suitable protocol (e.g., via suitable Representational State Transfer (REST) requests/responses). As such, for the design illustrated in FIG. 4B, by hosting a portion of the NLU framework as a shared resource accessible to multiple client instances 42, the size of the client instance 42 can be substantially reduced (e.g., compared to the embodiment of the agent automation framework 100 illustrated in FIG. 4A) and the overall efficiency of the agent automation framework 100 can be improved.

In particular, the NLU framework 104 illustrated in FIG. 4B is divided into three distinct components that perform distinct processes within the NLU framework 104. These components include: a shared NLU trainer 126 hosted by the enterprise instance 125, a shared NLU annotator 127 hosted by the enterprise instance 125, and an NLU predictor 128 hosted by the client instance 42. It may be appreciated that the organizations illustrated in FIGS. 4A and 4B are merely examples, and in other embodiments, other organizations of the NLU framework 104 and/or the agent automation framework 100 may be used, in accordance with the present disclosure.

For the embodiment of the agent automation framework 100 illustrated in FIG. 4B, the shared NLU trainer 126 is designed to receive the corpus of utterances 112 from the client instance 42, and to perform semantic mining (e.g., including semantic parsing, grammar engineering, and so forth) to facilitate generation of the intent-entity model 108. Once the intent-entity model 108 has been generated, when the BE 102 receives the user utterance 122 provided by the client device 14D, the NLU predictor 128 passes the utterance 122 and the intent-entity model 108 to the shared NLU annotator 127 for parsing and annotation of the utterance 122. The shared NLU annotator 127 performs semantic parsing, grammar engineering, and so forth, of the utterance 122 based on the intent-entity model 108 and returns annotated utterance trees of the utterance 122 to the NLU predictor 128 of client instance 42. The NLU predictor 128 then uses these annotated structures of the utterance 122, discussed below in greater detail, to identify matching intents from the intent-entity model 108, such that the BE 102 can perform one or more actions based on the identified intents. It may be appreciated that the shared NLU annotator 127 may correspond to the meaning extraction subsystem 150, and the NLU predictor 128 may correspond to the meaning search subsystem 152, of the NLU framework 104, as discussed below.

FIG. 5 is a flow diagram depicting a process 145 by which the behavior engine (BE) 102 and NLU framework 104 perform respective roles within an embodiment of the agent automation framework 100. For the illustrated embodiment, the NLU framework 104 processes a received user utterance 122 to extract artifacts 140 (e.g., intents and/or entities) based on the intent-entity model 108. The extracted artifacts 140 may be implemented as a collection of symbols that represent intents and entities of the user utterance 122 in a form that is consumable by the BE 102. As such, these extracted artifacts 140 are provided to the BE 102, which processes the received artifacts 140 based on the conversation model 110 to determine suitable actions 142 (e.g., changing a password, creating a record, purchasing an item, closing an account) and/or virtual agent utterances 124 in response to the received user utterance 122. As indicated by the arrow 144, the process 145 can continuously repeat as the agent automation framework 100 receives and addresses additional user utterances 122 from the same user and/or other users in a conversational format.
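
The division of labor in process 145 can be sketched as two functions: one standing in for the NLU framework 104 and one for the BE 102. Everything below is a deliberately simplified assumption: the keyword lookup merely substitutes for the meaning search, and the dictionary-based models, intent names, and action names are invented for illustration.

```python
def extract_artifacts(utterance, intent_entity_model):
    """Toy NLU step: a keyword lookup standing in for the meaning search."""
    words = utterance.lower().split()
    return [intent for keyword, intent in intent_entity_model.items() if keyword in words]


def behavior_engine(artifacts, conversation_model):
    """Map each extracted intent to its (action, agent utterance) pair, if any."""
    return [conversation_model[a] for a in artifacts if a in conversation_model]


# Hypothetical models: keyword -> intent, and intent -> (action, response).
intent_entity_model = {"password": "change_password", "order": "purchase_item"}
conversation_model = {
    "change_password": ("reset_credentials", "Your password has been changed."),
    "purchase_item": ("create_order", "Your order has been placed."),
}


def handle(utterance):
    """One pass through the loop: utterance -> artifacts -> actions and responses."""
    artifacts = extract_artifacts(utterance, intent_entity_model)
    return behavior_engine(artifacts, conversation_model)
```

Calling `handle` once per incoming utterance mirrors the repetition indicated by the arrow 144; an utterance that yields no artifacts simply produces no actions.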

As illustrated in FIG. 5, it may be appreciated that, in certain situations, no further action or communications may occur once the suitable actions 142 have been performed. Additionally, it should be noted that, while the user utterance 122 and the agent utterance 124 are discussed herein as being conveyed using a written conversational medium or channel (e.g., chat, email, ticketing system, text messages, forum posts), in other embodiments, voice-to-text and/or text-to-voice modules or plugins could be included to translate spoken user utterance 122 into text and/or translate text-based agent utterance 124 into speech to enable a voice interactive system, in accordance with the present disclosure. Furthermore, in certain embodiments, both the user utterance 122 and the virtual agent utterance 124 may be stored in the database 106 (e.g., in the corpus of utterances 112) to enable continued learning of new structure and vocabulary within the agent automation framework 100.

As mentioned, the NLU framework 104 includes two primary subsystems that cooperate to convert the hard problem of NLU into a manageable search problem, namely: a meaning extraction subsystem and a meaning search subsystem. For example, FIG. 6 is a block diagram illustrating roles of the meaning extraction subsystem 150 and the meaning search subsystem 152 of the NLU framework 104 within an embodiment of the agent automation framework 100. For the illustrated embodiment, a right-hand portion 154 of FIG. 6 illustrates the meaning extraction subsystem 150 of the NLU framework 104 receiving the intent-entity model 108, which includes sample utterances 155 for each of the various artifacts of the model. The meaning extraction subsystem 150 generates an understanding model 157 that includes meaning representations 158 (e.g., utterance tree structures) of the sample utterances 155 of the intent-entity model 108. In other words, the understanding model 157 is a translated or augmented version of the intent-entity model 108 that includes meaning representations 158 to enable searching (e.g., comparison and matching) by the meaning search subsystem 152, as discussed in more detail below. As such, it may be appreciated that the right-hand portion 154 of FIG. 6 is generally performed in advance of receiving the user utterance 122, such as on a routine, scheduled basis or in response to updates to the intent-entity model 108.

For the embodiment illustrated in FIG. 6, a left-hand portion 156 illustrates the meaning extraction subsystem 150 also receiving and processing the user utterance 122 to generate an utterance meaning model 160 having at least one meaning representation 162. As discussed in greater detail below, these meaning representations 158 and 162 are data structures having a form that captures the grammatical, syntactic structure of an utterance, wherein subtrees of the data structures include subtree vectors that encode the semantic meanings of portions of the utterance. As such, for a given utterance, a corresponding meaning representation captures both syntactic and semantic meaning in a common meaning representation format that enables searching, comparison, and matching by the meaning search subsystem 152, as discussed in greater detail below. Accordingly, the meaning representations 162 of the utterance meaning model 160 can generally be thought of as one or more search keys, while the meaning representations 158 of the understanding model 157 define a search space in which the search keys can be sought. Thus, the meaning search subsystem 152 searches the meaning representations 158 of the understanding model 157 to locate one or more artifacts that match the meaning representation 162 of the utterance meaning model 160 as discussed below, thereby generating the extracted artifacts 140.
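
The search-key and search-space relationship described above can be illustrated with a nearest-match sketch. Here each meaning representation is reduced to a single made-up two-dimensional vector, and the Euclidean distance threshold of 0.5 is an arbitrary assumption; the disclosed meaning search compares tree-structured representations and is considerably more involved.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def meaning_search(search_keys, search_space, max_distance=0.5):
    """Return intents whose representations lie within max_distance of any
    search key, ordered from closest to farthest.

    search_keys: non-empty list of vectors derived from the user utterance.
    search_space: list of (intent, vector) pairs derived from sample utterances.
    """
    matches = {}
    for intent, vector in search_space:
        best = min(distance(key, vector) for key in search_keys)
        if best <= max_distance and best < matches.get(intent, math.inf):
            matches[intent] = best
    return sorted(matches, key=matches.get)


search_space = [
    ("purchase", [0.9, 0.1]),
    ("return_item", [0.1, 0.9]),
    ("purchase", [0.8, 0.3]),
]
search_keys = [[0.85, 0.15], [0.7, 0.2]]
extracted = meaning_search(search_keys, search_space)
```

Supplying several search keys, one per alternative understanding of the user utterance, gives the search more than one chance to land near a matching representation in the search space.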

As an example of one of the meaning representations 158, 162 disclosed herein, FIG. 7 is a diagram illustrating an example of an utterance tree 166 generated for an utterance. As should be understood, the utterance tree 166 is a data structure that is generated by the meaning extraction subsystem 150 based on the user utterance 122, or alternatively, based on one of the sample utterances 155. For the example illustrated in FIG. 7, the utterance tree 166 is based on an example utterance, “I want to go to the store by the mall today to buy a blue, collared shirt and black pants and also to return some defective batteries.” The illustrated utterance tree 166 includes a set of nodes 202 (e.g., nodes 202A, 202B, 202C, 202D, 202E, 202F, 202G, 202H, 202I, 202J, 202K, 202L, 202M, 202N, and 202P) arranged in a tree structure, with each node representing a particular word or phrase (e.g., token) of the example utterance. It may be noted that each of the nodes 202 may also be described as representing a particular subtree of the utterance tree 166, wherein a subtree can include one or more nodes 202.

The form or shape of the utterance tree 166 illustrated in FIG. 7 is determined by the meaning extraction subsystem 150 and represents the syntactic, grammatical meaning of one understanding of the example utterance. More specifically, a prosody subsystem of the meaning extraction subsystem 150 breaks the utterance into intent segments, while a structure subsystem of the meaning extraction subsystem 150 constructs the utterance tree 166 from these intent segments. Each of the nodes 202 stores or references a respective word vector (e.g., token) that is determined by the vocabulary subsystem to indicate the semantic meaning of the particular word or phrase of the utterance. As mentioned, each word vector is an ordered n-dimensional list (e.g., a 300-dimensional list) of floating point values (e.g., a 1×N or an N×1 matrix) that provides a mathematical representation of the semantic meaning of a portion of an utterance.

Moreover, in other embodiments, each of the nodes 202 may be annotated by the structure subsystem with additional information about the word or phrase represented by the node to form an annotated embodiment of the utterance tree 166. For example, each of the nodes 202 may include a respective tag, identifier, shading, or cross-hatching that is indicative of a class annotation of the respective node. In particular, for the example utterance tree 166 illustrated in FIG. 7, certain subtrees or nodes (e.g., nodes 202A, 202B, 202C, and 202D) may be annotated with part-of-speech labels or tags to be verb nodes, certain subtrees or nodes (e.g., nodes 202E, 202F, 202G, 202H, 202I, and 202J) may be annotated to be subject or object nodes, and certain subtrees or nodes (e.g., nodes 202K, 202L, 202M, 202N, and 202P) may be annotated to be modifier nodes (e.g., subject modifier nodes, object modifier nodes, verb modifier nodes) by the structure subsystem. These class annotations may then be used by the meaning search subsystem 152 when comparing meaning representations that are generated from annotated utterance trees.
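By way of illustration only, an annotated utterance tree of the kind described above may be sketched as a small recursive data structure. The class name `Node`, its fields, and the two-element placeholder vectors are assumptions made for this sketch and are not taken from the disclosed embodiments; in practice the word vectors would be much longer (e.g., 300 values).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node of an utterance tree (names are illustrative)."""
    token: str                                # word or phrase represented by the node
    vector: List[float]                       # word vector (placeholder values here)
    class_annotation: Optional[str] = None    # e.g. "verb", "object", "modifier"
    children: List["Node"] = field(default_factory=list)

    def subtree_tokens(self) -> List[str]:
        """Return the tokens of the subtree rooted at this node, depth-first."""
        tokens = [self.token]
        for child in self.children:
            tokens.extend(child.subtree_tokens())
        return tokens

    def nodes_with_class(self, cls: str) -> List["Node"]:
        """Collect every node in this subtree carrying the given class annotation."""
        found = [self] if self.class_annotation == cls else []
        for child in self.children:
            found.extend(child.nodes_with_class(cls))
        return found


# Toy fragment of the example utterance: "to buy" -> "shirt" -> "blue"
shirt = Node("shirt", [0.1, 0.2], "object", [Node("blue", [0.3, 0.1], "modifier")])
buy = Node("to buy", [0.5, 0.4], "verb", [shirt])

print(buy.subtree_tokens())
print([n.token for n in buy.nodes_with_class("modifier")])
```

In such a sketch, each node doubles as the root of a subtree, which mirrors the observation that each of the nodes 202 may be described as representing a particular subtree.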

As such, it may be appreciated that the utterance tree 166, from which the meaning representations are generated, serves as a basis (e.g., an initial basis) for artifact extraction. Further to this effect, the nodes 202 of certain embodiments of the utterance tree 166, such as those in which the utterance is a sample utterance 155 of the intent-entity model 108, may also be annotated or tagged with respective artifact labels (e.g., intent labels and/or entity labels) that identify a particular one of the nodes as a particular entity that is defined within a particular intent. For example, during construction of the intent-entity model 108, an author of the intent-entity model 108 may identify the sample utterance 155 as belonging to a particular intent. Within the intent-entity model 108, the author may then identify (e.g., highlight, annotate, label) certain tokens within the sample utterance 155, such as particular entities, as entities that belong to or are associated with the intent. For example, a “purchase product” intent may have a number of labeled entities within an intent-entity model 108 that belong to the intent, such as a “brand” entity, a “model” entity, a “color” entity, a “size” entity, a “shipping address” entity, and so forth, depending on the nature of the product. In some embodiments, at least a portion of the sample utterances 155 are associated with artifact labels by machine learning features of the NLU framework, either in addition to or in alternative to the manually-specified annotations of an author. In either case, the artifact labels for the sample utterances 155 may be specific to the particular relationships defined by the associated understanding model, as well as the underlying intent-entity model, providing a structure that enables improved inference within the scope of the particular intent-entity model 108.

Based on cognitive construction grammar techniques, the structure subsystem may consider the artifact labeling of the sample utterances 155 to guide determination of the shape of the utterance tree 166. It should be understood that the artifact labels applied to the tokens of the sample utterances 155 may be propagated to the meaning representations 158, and in particular, are included as parameters associated with the respective nodes of the meaning representations, where each node represents a token of the utterance. For example, in the present embodiment, the entirety of the sample utterance 155 of the utterance tree 166 may be identified as belonging to one (or multiple) of a desire intent, a travel intent, or a purchase intent, based on the presence of the “want” of node 202A, the “to go” of node 202B, the “to buy” of node 202C, and the “to return” of node 202D. These intents may therefore be leveraged within the structure of a particular intent-entity model 108 to identify meaning matches to meaning representations within the intent categories.

Further, certain nodes of the utterance tree 166 may be labeled as an entity that is associated with a particular intent of the sample utterance 155. For example, the tokens “store” of node 202F and “mall” of node 202G may include artifact labels to indicate that the tokens are labeled entities that correspond to the purchase intent. The nodes 202F, 202G may both be annotated as entities that represent locations within the travel intent. That is, each of these labeled entities further defines its associated intent to enable the NLU framework 104 to derive meaning and perspective from user utterances 122 related to the associated intent. The artifact labels discussed herein are further leveraged by an artifact pinning subsystem, as discussed below, to verify whether certain cleansed forms and/or alternative forms of meaning representations generated for various sample utterances 155 have valid form interpretations that uphold the embedded artifact relationship information within a particular intent-entity model 108. By excluding interpretations that construe entities in manners unsuited to (e.g., different from, not in accordance with) the artifact labels of the sample utterances 155, the ambiguity of complex user or sample utterances, such as polysemic utterances and/or utterances having multi-word entities that are interpretable in multiple ways, can be reduced or eliminated. It should be understood that embodiments of the utterance tree 166 that are generated from a received user utterance 122 provided to the BE 102 may also include any other suitable tags that the BE 102 derives from the context of interaction with an end-user, such as tags indicating that a particular user utterance 122 was received during discussion of a particular intent, during a certain time of day, during occurrence of a particular news or weather event, and so forth.

FIG. 8 is an information flow diagram illustrating an embodiment of an artifact pinning subsystem 250 compiling a search space 252 from multiple understanding models 157, and then implementing one or more search keys 254 against the search space 252 to identify the extracted artifacts 140. As acknowledged herein, to facilitate generation of the extracted artifacts 140, the artifact pinning subsystem 250 may expand either or both the search keys 254 and the search space 252 and pin the search space 252 to target variations of below-discussed sets of the meaning representations 158. The meaning representations 158 of the search space 252 may be pinned in response to the artifact pinning subsystem determining that the meaning representations 158 are formulated with valid configurations of artifact assignments therein, align with a current contextual intent of conversation between the user and the BE 102, or both.

Providing more detail herein with respect to generation of the search space 252, the artifact pinning subsystem 250 may aggregate the sample utterances 155 of a set 270 of intent-entity models, such as multiple intent-entity models 108 that are each suited for a particular purpose or domain. As discussed above, the sample utterances 155 are individually associated with artifact labels 272 that link model-specific relationships between various intents and various entities of the sample utterances 155. For example, each intent-entity model 108 of the set 270 may include sample utterances 155 that provide guidance for the NLU framework 104 to perform meaning searches with respect to any suitable natural language interaction with users, such as greeting users, managing meetings, managing a particular product of an enterprise, managing human resource actions, and/or concluding conversations with users, among many other suitable interactions. The sample utterances 155 are analyzed by the meaning extraction subsystem 150 of the artifact pinning subsystem 250 to generate a set 282 of meaning representations that assign possible forms to, as well as consider polysemic expression of, each respective sample utterance 155. For the set 282 of meaning representations, a respective understanding model of a set 284 of understanding models may therefore be generated, wherein each understanding model of the set 284 defines a respective model-specific search space 286.

As discussed below, the artifact pinning subsystem 250 desirably pins suitable meaning representation candidates of the multiple model-specific search spaces 286 to compile the search space 252 (e.g., compiled search space). In particular, for a given model-specific search space 286, the artifact pinning subsystem 250 identifies and pins suitable candidates from the set 282 of meaning representations that align with the artifact labels 272 of the particular one of the set 270 of intent-entity models from which the particular one of the set 284 of understanding models was derived. It should be understood that the artifact pinning subsystem 250 may compile the search space 252 after one or more conversations between a user and the BE 102, periodically, in response to receipt of new or updated sample utterances 155, and so forth.

Similarly, during search key generation and utilization, the artifact pinning subsystem 250 receives a user utterance 122 and derives a set 290 of meaning representations for the user utterance 122 that assigns potential entities to tokens within the user utterance 122. Notably, the artifact pinning subsystem 250 may not prune the set 290 of meaning representations because the search space 252 is already pruned to surviving meaning representations 158 that uphold the artifact relationships set forth by the artifact labels 272 of the set 270 of intent-entity models. Additionally, the artifact pinning subsystem 250 may further refine the set 282 of meaning representations of the set 270 of intent-entity models based on a context of a conversation between the user and the BE 102, such as a conversation to order a product or schedule a meeting. Thus, the artifact pinning subsystem 250 generates the tailored utterance meaning model 160 from the set 290 of meaning representations as the search keys 254 for comparison to the particularly context-aware search space 252. Indeed, as discussed in more detail below, the meaning search subsystem 152 may compare the meaning representations of the set 290 defining the search keys 254 to the meaning representations 158 of the search space 252 to identify any suitable, matching meaning representations 158, which enable the NLU framework 104 to identify the extracted artifacts 140 therefrom. The meaning search subsystem 152 may also score the matching meaning representations 158 and/or the artifacts therein with an accompanying confidence level to facilitate appropriate agent responses 124 and/or actions 142 to the most likely extracted artifacts 140 from the meaning representations 158. As will be understood, the disclosed embodiments of the model-based and context-specific expansion and pruning (e.g., pinning) of the meaning representations performed by the artifact pinning subsystem 250 provide enhancements to the search space 252 and/or the search keys 254 to facilitate efficient, relationship-aware identification of the extracted artifacts 140 for improved meaning search quality.

FIG. 9 is an embodiment of an information flow diagram illustrating the artifact pinning subsystem 250 pinning entities within meaning representations utilized to generate the search space 252. As recognized herein, the artifact pinning subsystem 250 is designed to leverage embedded relationships between the artifacts of the sample utterances 155 of the set 270 of intent-entity models 108 to derive a targeted subset of meaning representations for meaning searches. In the present embodiment, the artifact pinning subsystem 250 receives or considers one of the sample utterances 155. As mentioned above and discussed in more detail below, the artifact pinning subsystem 250 derives (e.g., generates, extracts) multiple alternative form assignments for the utterance 300 via any suitable processes, including alternative parse structure discovery, re-expressions, cleansing, vocabulary injection or substitution, and/or various part-of-speech assignments that are applied to respective tokens associated with nodes of utterance trees of the meaning representations for the sample utterance 155, thereby generating a set 310 of meaning representation candidates that corresponds to the set 282 of meaning representations derived from the sample utterances 155 of one intent-entity model of the set 270 discussed above.

Generally, each meaning representation candidate of the set 310 may have a respective CCG form that is assigned based on the structure of the sample utterance 155, including the interdependencies of the intents and entities within each meaning representation candidate. The disclosed artifact pinning subsystem 250 performs entity pinning to enable the meaning search subsystem 152 to focus on meaning representation candidates of the set 310 that align with the aforementioned relational cues embedded within the associated intent-entity model 108 (e.g., defined herein as meaning representations having valid entity formulations), thereby leveraging these embedded relationships to verify the prescribed forms of the set 310 of meaning representation candidates. That is, assuming that each meaning representation candidate of the set 310 is related to a particular intent, the artifact pinning subsystem 250 pins one or multiple entities 312 (illustrated via filled circles in the present embodiment) therein that have a valid entity formulation and that correspond with the labeled or annotated entities within the set 270 of intent-entity models (defined by the artifact labels 272). Other meaning representation candidates of the set 310 that do not contain a valid entity formulation (e.g., that do not correspond with the artifact labels 272) may therefore be disregarded as unviable candidates, while the entity-pinned meaning representation candidates of the set 310 are retained (e.g., pinned). The entity pinning faculties discussed herein may be performed with respect to the search space 252 to leverage the artifact-level relationships embedded within the set 270 of intent-entity models for structure-verification-based pruning. Indeed, with the search space 252 pruned and compiled to match the embedded relationships of the intent-entity model 108, the artifact pinning subsystem 250 enables the search keys 254 to be expanded, not pinned, and utilized as broader search components for improved inference quality. However, in other embodiments, the search keys 254 may be pruned in a manner similar to that of the search space 252, but with respect to potential, estimated artifact labels that are generated via ML techniques and contrasted to the artifact labels 272 of the sample utterances 155.

As a particular example, for a sample utterance 155 of “Order [three-way switch],” the set 310 of meaning representation candidates may include entity formulations and associated entity assignments (indicated as brackets) that express the sample utterance 155 as a first meaning representation for a first interpretation of the utterance as “I would like to order one three-way switch” and a second meaning representation for a second interpretation of the utterance as “I would like to order three and to switch my way.” Because the first interpretation includes the particular labeled entity [three-way switch], the artifact pinning subsystem 250 may retain and pin the first meaning representation as a valid parse of the sample utterance 155 in light of the associated intent-entity model 108. In contrast, because the second meaning representation does not include the particular labeled entity of the sample utterance 155, the artifact pinning subsystem 250 may determine that the second meaning representation is not a valid interpretation of the sample utterance 155 and remove or prune the second meaning representation from the search space 252, thereby directing appropriate consumption of utterances for improved understanding accuracy.
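The retain-or-prune decision in the preceding example may be sketched as a simple filter. This is a minimal illustration under stated assumptions: the `Candidate` class, its fields, and the `pin_candidates` function are hypothetical names, and each candidate is reduced to the set of entity spans that its parse treats as single units.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one meaning representation candidate."""
    interpretation: str
    intent: str
    entities: FrozenSet[str]   # entity spans this parse treats as single units


def pin_candidates(candidates: List[Candidate], labeled_entity: str) -> List[Candidate]:
    """Keep only candidates whose parse contains the labeled entity as a unit."""
    return [c for c in candidates if labeled_entity in c.entities]


candidates = [
    Candidate("I would like to order one three-way switch",
              "order", frozenset({"three-way switch"})),
    Candidate("I would like to order three and to switch my way",
              "order", frozenset({"three"})),
]

# The author labeled "three-way switch" as an entity in the sample utterance.
pinned = pin_candidates(candidates, "three-way switch")
for c in pinned:
    print(c.interpretation)
```

Only the first interpretation survives in this sketch, because the second parse splits the labeled span across several tokens.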

Moreover, with respect to meaning searches performed by comparing the search keys 254 to the search space 252, the artifact pinning subsystem 250 may also leverage contextual information provided by the BE 102 to verify whether any of the meaning representation candidates of the set 310 generated for the search space 252 align with the context of conversation (e.g., a current or on-going conversation) between an end-user and the BE 102. For example, in response to receiving a user utterance 122 requesting to set up a meeting at a particular time, the NLU framework 104 may perform a meaning search based on the user utterance 122 to determine that the utterance corresponds to a “meeting setup intent” defined in a particular intent-entity model 108. The NLU framework 104 provides this intent, along with any entities identified in the utterance, to the BE 102, as discussed above with respect to FIG. 5. In response, the BE 102 may execute a flow that is designed to handle the various tasks involved in scheduling the meeting for the user, including slot-filling information to complete the task from the entities (e.g., participants, location, time, topics) received from the NLU framework 104, and prompting the user to respond with additional details to fill remaining slots. With this in mind, it is presently recognized that the NLU framework 104 can use information about the current flow being executed by the BE 102 to provide context for interpreting subsequent utterances from the user. For the above example, the BE 102 may provide context (e.g., information regarding the current contextual intent) to the artifact pinning subsystem 250 of the NLU framework 104 indicating that the slot-filling process for the meeting setup intent is being performed. In response, the NLU framework 104 may perform a meaning search of a subsequent user utterance 122 that focuses on the entities related to the meeting setup intent (e.g., meeting participants, meeting locations, meeting times). That is, using the contextual intent information provided by the BE 102, the artifact pinning subsystem 250 of the NLU framework 104 may desirably narrow the search space 252 to particular meaning representation candidates having entities and/or intents that correspond to the current flow being executed by the BE 102.
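As a rough sketch of the context-based narrowing described above, the search space may be filtered by the contextual intent reported for the current flow. The dictionary layout, the function name `narrow_search_space`, and the choice to leave the search space unchanged when no context is available are all assumptions of this illustration rather than details of the disclosed embodiments.

```python
from typing import Dict, List, Optional


def narrow_search_space(search_space: List[Dict],
                        contextual_intent: Optional[str]) -> List[Dict]:
    """Keep meaning representations whose intent matches the current flow.

    If no contextual intent is available, the search space is returned
    unchanged (an assumption made for this sketch).
    """
    if contextual_intent is None:
        return list(search_space)
    return [mr for mr in search_space if mr["intent"] == contextual_intent]


# Hypothetical compiled search space with two intents.
search_space = [
    {"intent": "meeting setup", "entities": ["participants", "location", "time"]},
    {"intent": "purchase product", "entities": ["brand", "color"]},
]

# The behavior engine reports that slot-filling for a meeting is in progress.
narrowed = narrow_search_space(search_space, "meeting setup")
print([mr["intent"] for mr in narrowed])
```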

As will be understood, these techniques thereby efficiently leverage implicit clues derivable from the current conversation and/or provided by an author of the intent-entity model 108 to pin candidate meaning representations as suitable search spaces 252 that provide an improved meaning search. It should be understood that although the artifact labels 272 of FIG. 8 are primarily described with respect to an author of the intent-entity model 108, the NLU framework 104 may additionally or alternatively associate respective entities with respective intents within the sample utterances 155 via ML-based techniques, in some embodiments. The above-introduced entity-pinning may be further understood with respect to the following discussion, which relates to generation of the search space 252 followed by performing a meaning search by comparing the search keys 254 to a context-aware embodiment of the search space 252.

FIG. 10 is a flow diagram illustrating an embodiment of a process 350 whereby, during compilation of the search space 252, the artifact pinning subsystem 250 efficiently expands a number of meaning representations generated for the sample utterances 155 of the set 270 of intent-entity models, and then narrows the number via model-based pinning based on respective entities of each meaning representation corresponding to labeled entities of the intents of the set 270. The present embodiment of the process 350 particularly illustrates generation of the model-specific search space 286 for one understanding model of the set 270 of FIG. 8, though it should be understood that each model-specific search space 286 may be generated via the process 350 and aggregated into the search space 252 by any suitable data aggregation processes or structures. Additionally, the process 350 may be stored in a suitable memory (e.g., memory 86) and executed by a suitable processor (e.g., processor(s) 82) associated with the client instance 42 or the enterprise instance 125, as discussed above with respect to FIGS. 3, 4A, and 4B.

The artifact pinning subsystem 250 performing the illustrated embodiment of the process 350 begins with a vocabulary refinement phase 352 that cleanses and validates (block 354) the sample utterances 155 stored within one intent-entity model of the set 270. For example, the artifact pinning subsystem 250 may utilize a vocabulary subsystem 356 of the NLU framework 104 (e.g., corresponding to the vocabulary manager 118 in certain embodiments) to access and apply the rules 114 stored in the database 106 that modify certain tokens (e.g., words, phrases, punctuation, emojis) of the sample utterances 155. In some embodiments, the vocabulary subsystem 356 performs the vocabulary refinement phase 352 based on a vocabulary model 360 that is stored with the intent-entity model 108 within a particular understanding model of the set 282 or based on an aggregated vocabulary model derived from each respective vocabulary model of the set 282. By way of example, in certain embodiments, cleansing may involve applying a rule that removes non-textual elements (e.g., emoticons, emojis, punctuation) from the sample utterances 155. In certain embodiments, cleansing may involve correcting misspellings or typographical errors in the sample utterances 155. Additionally, in certain embodiments, cleansing may involve substituting certain tokens with other tokens. For example, the vocabulary subsystem 356 may apply a rule that replaces all entities having references to time or color with a generic or global entity, such as a global entity for a phone number, a time, a color, a meeting room, and so forth. In certain cases in which the intent-entity model 108 is a pre-built vocabulary model, the artifact pinning subsystem 250 may omit cleansing, while proceeding with validation to ensure the sample utterances 155 are valid sample utterances in view of validation rules stored within the database 106.

Continuing the vocabulary refinement phase 352 of the process 350, the artifact pinning subsystem 250 then performs (block 362) vocabulary injection on tokens of the sample utterances 155, thereby re-rendering the sample utterances 155 by adjusting phraseology and/or terminology of the sample utterances 155. Based on the vocabulary model 360 (or aggregate vocabulary model) stored within the understanding model 157, the artifact pinning subsystem 250 utilizes the vocabulary subsystem 356 to replace the content of certain tokens of the sample utterances 155 to become more discourse-appropriate phrases and/or terms. In certain embodiments, multiple phrases and/or terms may be replaced, and the various permutations of such replacements are used to generate a set 364 of utterances (e.g., vocabulary-adjusted sample utterances) for each sample utterance 155 of the particular intent-entity model 108. For example, in certain embodiments, the vocabulary subsystem 356 may access the vocabulary model 360 of the understanding model 157 to identify alternative vocabulary that can be used to generate re-expressions of the utterances having different tokens. By way of specific example, in an embodiment, the vocabulary subsystem 356 may determine that a synonym for “developer” is “employee,” and may generate a new utterance in which the term “developer” is substituted by the term “employee.”

For the embodiment illustrated in FIG. 10, after cleansing and vocabulary injection, the artifact pinning subsystem 250 removes (block 366) vocabulary-level duplicates from the set 364 of utterances. In situations in which identical utterances were generated for a given sample utterance 155, the vocabulary-level deduplication of block 366 generally discards all but one utterance from the set 364, efficiently conserving computing resources of the NLU framework 104. It may be appreciated that the set 364 of utterances may include original embodiments of the sample utterances 155 or cleansed versions of the sample utterances 155, and may include any suitable number of alternative re-expression utterances generated through the vocabulary injection of block 362. It may be noted that, in certain circumstances, the vocabulary injection of block 362 may not generate re-expressions of the sample utterances 155, and as such, the set 364 of utterances may only include the sample utterances 155, or cleansed versions thereof. In other embodiments, the sample utterances 155 may be provided directly to a structure subsystem 370 of the NLU framework 104 without the cleansing (e.g., in situations in which the intent-entity model 108 is a pre-built model) or the vocabulary injection of block 362.
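The vocabulary injection of block 362 and the vocabulary-level deduplication of block 366 may be sketched together as follows. This is an illustrative simplification that assumes whitespace-delimited tokens and a plain synonym dictionary in place of the vocabulary model 360; the function names `inject_vocabulary` and `dedupe_utterances` are hypothetical.

```python
from itertools import product
from typing import Dict, List


def inject_vocabulary(tokens: List[str], synonyms: Dict[str, List[str]]) -> List[str]:
    """Generate every permutation of synonym substitutions for an utterance.

    Each token keeps its original form as the first option, so the original
    utterance is always the first result.
    """
    options = [[tok] + synonyms.get(tok, []) for tok in tokens]
    return [" ".join(choice) for choice in product(*options)]


def dedupe_utterances(utterances: List[str]) -> List[str]:
    """Discard all but the first copy of identical utterances, keeping order."""
    return list(dict.fromkeys(utterances))


# Hypothetical vocabulary: "developer" may be re-expressed as "employee".
# The repeated synonym simulates a rule set that yields identical utterances.
synonyms = {"developer": ["employee", "employee"]}
expanded = inject_vocabulary(["assign", "the", "developer"], synonyms)
utterance_set = dedupe_utterances(expanded)

print(expanded)
print(utterance_set)
```

Because the number of permutations grows multiplicatively with the number of substitutable tokens, removing duplicates before parsing avoids repeated work in the later phases.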

The artifact pinning subsystem 250 may subsequently direct remaining candidates of the set 364 to the structure subsystem 370 of an alternative form expansion phase 372 of the process 350 for part-of-speech (POS) tagging and parsing. That is, in the illustrated embodiment, the artifact pinning subsystem 250 implements the structure subsystem 370 to generate (block 374) the set 282 of one or more meaning representations that are representative of the sample utterances 155 of the intent-entity model 108. For the embodiment illustrated in FIG. 10, after generating the set 282 of meaning representations, the artifact pinning subsystem 250 may remove (block 376) any artifact-level duplicates across the entire set 282 of meaning representations. That is, because generation of the alternative forms of the meaning representations may produce permutations of meaning representations for the set 364 of utterances that overlap one another, the artifact pinning subsystem 250 analyzes each artifact across the set 282 of meaning representations to identify any meaning representations of the set 282 that are unsuitably similar to other meaning representations. It should be understood that for embodiments in which the search space 252 is generated from multiple intent-entity models of the set 270, the duplication removal of block 376 may be extended to remove duplicates between multiple embodiments of the set 364 of meaning representations, thereby providing improved processing speed during subsequent meaning search processes. Moreover, it should be understood that the above-discussed vocabulary refinement phase 352 and alternative form expansion phase 372 are examples of certain processes the artifact pinning subsystem 250 may implement to generate the set 282 of meaning representations, and that other suitable processes, including alternative parse structure discovery, vocabulary substitution, and/or re-expressions, may be implemented in addition or in alternative to the phases 352, 372.

The artifact pinning subsystem 250 performing the illustrated embodiment of FIG. 10 may therefore analyze the set 282 of meaning representations during a model-based entity pinning phase 380 of the process 350 that is performed with respect to each intent present within the set 282. For example, for each intent present within the set 282 of meaning representations (block 382), the artifact pinning subsystem 250 considers (block 384) each meaning representation of the set 282 determined to have at least one particular labeled entity for the respective intent. Within this nested loop structure, the artifact pinning subsystem 250 may iteratively identify (block 386) a set 388 of meaning representations (for each respective intent) that include a respective entity, in a suitable entity form, that corresponds to the particular labeled entity of the particular intent. By consulting the artifact labels 272 defined within the respective intent-entity model 108, the artifact pinning subsystem 250 may verify whether these parses are valid interpretations of the utterance within the structure defined by the given intent-entity model 108. The artifact pinning subsystem 250 thus retains or pins (block 390) the set 388 of meaning representations that are valid formulations of the respective meaning representation having a particular intent, for each intent of the intent-entity model 108. For embodiments simultaneously consolidating multiple intent-entity models 108, the model-based entity pinning phase 380 may be performed individually (and in some embodiments, simultaneously or in parallel) for each respective intent-entity model 108, thereby testing the validity of each meaning representation of the set 282 against the artifact labels 272 most suitable for verifying the appropriateness of the respective parse and POS tagging of each meaning representation.

For example, given a sample utterance stating “Book meeting,” a first meaning representation of the set 282 of the meaning representations may indicate a parse and POS tagging in which “book” is interpreted as a verb-style intent (e.g., corresponding to a “schedule” intent) having “meeting” as an entity. As such, the first meaning representation of the set 282 may be interpreted as, “I want to schedule a meeting.” For a second meaning representation of the set 282, the parse and POS tagging may interpret “book” as a noun, such that the second meaning representation is interpreted as, “I want a book meeting.” Assuming that a “schedule” intent of the intent-entity model 108 includes at least one sample utterance 155 with author-labeled entities defined therein, the artifact pinning subsystem 250 analyzes the artifact labels 272 to determine whether “meeting” in the first meaning representation corresponds to at least one appropriate labeled entity of the “schedule” intent of the intent-entity model 108. In response to determining that “meeting” in the first meaning representation corresponds to a “meeting” entity of the “schedule” intent within the intent-entity model 108, the artifact pinning subsystem 250 may pin the first meaning representation as a valid formulation of the sample utterance 155. In contrast, upon analyzing a “desire” or “I want” intent of the intent-entity model 108 when considering the second meaning representation of the set 282, the artifact pinning subsystem 250 may determine that there are no sample utterances having a labeled “book meeting” entity defined for the “desire” intent within the intent-entity model 108. As such, the artifact pinning subsystem 250 may recognize that the second meaning representation is an invalid or irrelevant parse of the particular sample utterance 155, at least with respect to the particular domain in which the intent-entity model 108 is defined.
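The nested loop of the model-based entity pinning phase 380 (blocks 382 through 390) may be sketched for the “Book meeting” example as shown below. The data layout is assumed for illustration: `artifact_labels` is a hypothetical mapping from each intent to the entities labeled for it in the intent-entity model, and each meaning representation is reduced to a dictionary with an intent and its parsed entities.

```python
from typing import Dict, List, Set

# Hypothetical artifact labels: the "schedule" intent has a labeled "meeting"
# entity, while the "desire" intent has no labeled "book meeting" entity.
artifact_labels: Dict[str, Set[str]] = {
    "schedule": {"meeting"},
    "desire": set(),
}

meaning_representations: List[Dict] = [
    {"intent": "schedule", "entities": ["meeting"],
     "reading": "I want to schedule a meeting"},
    {"intent": "desire", "entities": ["book meeting"],
     "reading": "I want a book meeting"},
]


def pin_by_model(mrs: List[Dict], labels: Dict[str, Set[str]]) -> Dict[str, List[Dict]]:
    """For each intent, pin the meaning representations whose parsed entities
    correspond to at least one entity labeled for that intent."""
    pinned: Dict[str, List[Dict]] = {}
    for intent, labeled_entities in labels.items():      # loop over intents
        for mr in mrs:                                   # loop over candidates
            if mr["intent"] != intent:
                continue
            if any(entity in labeled_entities for entity in mr["entities"]):
                pinned.setdefault(intent, []).append(mr)
    return pinned


pinned_by_intent = pin_by_model(meaning_representations, artifact_labels)
for intent, mrs in pinned_by_intent.items():
    print(intent, [mr["reading"] for mr in mrs])
```

In this sketch the second reading is dropped because the “desire” intent carries no matching labeled entity, which corresponds to treating that parse as invalid within the domain of the model.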

With the set 388 of meaning representations pinned as having suitable structure for each intent, the artifact pinning subsystem 250 may perform a re-expression phase 392 of the process 350 in which the artifact pinning subsystem 250 determines (block 394) suitable re-expressions of the tokens of the meaning representations of the set 388. For example, for the validated and number-reduced set 388 of meaning representations, the artifact pinning subsystem 250 may adjust the order of tokens, add tokens, remove tokens, transform tokens, generalize tokens, or otherwise manipulate the expression of each token of each meaning representation of the set 388. In certain embodiments, the re-expression phase 392 of the process 350 is performed by the NLU framework 104 described in U.S. patent application Ser. No. 16/239,218, entitled, “TEMPLATED RULE-BASED DATA AUGMENTATION FOR INTENT EXTRACTION,” filed Jan. 3, 2019, which is incorporated by reference herein in its entirety for all purposes.

The step of block 394 may generate multiple different meaning representations for each meaning representation of the set 388, again expanding the number of potential meaning representation candidates for inclusion in the search space 252. However, the artifact pinning subsystem 250 of certain embodiments may again leverage the artifact labels 272 to discard (block 396) certain meaning representations of the re-expressed variants of the set 388 that do not include the respective pinned entity therein. Then, the artifact pinning subsystem 250 removes (block 398) any artifact-level duplicates across the entire set 388 of meaning representations, as discussed above with respect to block 376. As such, the artifact pinning subsystem 250 may provide the suitably-distinct meaning representations remaining within the set 388 as meaningful candidates within a compiled understanding model 400, which corresponds to the compiled search space 252 introduced above. In some embodiments, the NLU framework 104 having the artifact pinning subsystem 250 may maintain the compiled search space 252 for a threshold amount of time, for use for a threshold number of meaning searches, and so forth.
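
The discard and de-duplication steps of blocks 396 and 398 may be pictured with the short sketch below. It is illustrative only: the tuple layout `(intent, entities, tokens)` and the function name `filter_and_dedupe` are assumptions, and the duplicate test used here (identical intent, entities, and tokens) is a simplification standing in for the artifact-level comparison discussed with respect to block 376.

```python
def filter_and_dedupe(reexpressed, pinned_entity):
    """Drop variants lacking the pinned entity, then drop repeated variants.

    Each variant is a hypothetical (intent, entities, tokens) tuple of tuples.
    """
    seen = set()
    kept = []
    for variant in reexpressed:
        intent, entities, tokens = variant
        if pinned_entity not in entities:
            continue  # analogous to block 396: pinned entity was lost
        if variant in seen:
            continue  # analogous to block 398: duplicate of an earlier variant
        seen.add(variant)
        kept.append(variant)
    return kept


variants = [
    ("schedule", ("meeting",), ("schedule", "a", "meeting")),
    ("schedule", ("meeting",), ("set", "up", "a", "meeting")),
    ("schedule", ("meeting",), ("schedule", "a", "meeting")),  # exact repeat
    ("schedule", (), ("schedule", "it")),  # entity removed by re-expression
]
search_space_candidates = filter_and_dedupe(variants, "meeting")
```

In this toy input, two distinct variants containing the pinned "meeting" entity remain as candidates for the search space.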

FIG. 11 is a flow diagram illustrating a process 450 whereby theartifact pinning subsystem 250 implements model-based entity pinning andBE context based intent-pinning to expand and proficiently narrow thesearch space 252 that is subsequently applied against the search key 254during a meaning search, as discussed below. For example, theillustrated embodiment of the process 450 begins with the artifactpinning subsystem 250 receiving the user utterance 122 and performing arefinement and expansion phase 452 to generate a set 454 of potentialmeaning representations for the user utterance 122. In certainembodiments, the refinement and expansion phase 452 is a sequentialcombination of the vocabulary refinement phase 352 and the alternativeform expansion phase 372 each discussed above with respect to FIG. 10.In such embodiments, the artifact pinning subsystem 250 may directoperation of the vocabulary subsystem 356 to perform (block 456)vocabulary refinement on the user utterance 122, then direct operationof the structure subsystem 370 to perform (block 458) alternative formexpansion on the variants of the user utterance 122 generated by thevocabulary subsystem 356. The set 454 of potential meaningrepresentations for the user utterance 122 produced via the refinementand expansion phase 452 may therefore express the user utterance 122 inmultiple vocabulary, parse, and POS variants. It should be understoodthat the refinement and expansion phase 452 may also include removal oftoken-level duplicates and artifact level duplicates from the set 454 ofpotential meaning representations, in some embodiments.

In other embodiments, the refinement and expansion phase 452 may bemodified to include any suitable processes that generate the set 454 ofpotential meaning representations having zero, one, or moresuitably-distinct variations of the respective user utterance 122, inaccordance with the present disclosure. For example, any one or multipleof vocabulary cleansing, vocabulary substitution, vocabulary injection,various part-of-speech assignments, alternative parse structurediscovery, and re-expressions may be performed to generate the set 454of potential meaning representations. As such, the artifact pinningsubsystem 250 may aggregate the suitably-distinct meaningrepresentations of each generated set 454 of potential meaningrepresentations as meaningful candidates within the utterance meaningmodel 160, thereby forming the utterance meaning model 160 as the one ormore search keys 254 suitable for efficient comparison to the searchspace 252.

Notably, it is presently recognized that the artifact pinning subsystem250 may leverage contextual intent information related to a currentconversation between the user and the BE 102 via a BE context intentpinning phase 470 that improves targeted refinement of the search space252. In the illustrated embodiment, the artifact pinning subsystem 250may identify (block 472) a contextual intent 474 from the context of thecurrent episode between the user and the BE 102. As examples, thecontextual intent 474 may be a purchase intent, a meeting setup intent,a travel intent, a desire intent, and so forth, based on the currentoperation of the BE 102 with respect to the dialog between the user andthe BE 102. In some embodiments, the BE 102 may provide or makeavailable to the NLU framework 104 the current contextual intent 474 ofthe BE 102, which may be utilized to tailor operation of the NLUframework 104 for subsequent intent inference operations. For example,in certain embodiments, the contextual intent 474 may be in the form ofan indication of a flow (e.g., script, process) that is currently beingexecuted by the BE 102 in response to a previously received userutterance 122, wherein the flow corresponds to a particular intent(e.g., a purchase intent, a schedule meeting intent) that was inferencedby the NLU framework 104 from the previously received user utterance122. In other embodiments, the artifact pinning subsystem 250 maydetermine the contextual intent 474 from previous episodes, or from anysuitable context information associated with an episode of aconversation that the BE 102 may use to determine suitable actions inresponse to extracted intents and/or entities of the user utterance 122.Moreover, in certain embodiments, the artifact pinning subsystem 250and/or the BE 102 may identify that a slot-filling operation orconversation is to be performed to gather additional information orentities associated with the contextual intent 474.

Based on the contextual intent 474 of the BE 102, the artifact pinningsubsystem 250 may aggregate (block 476) or identify relevant meaningrepresentations 480 of the meaning representations 382 of the compiledunderstanding model 400 that are associated with (e.g., include) thecontextual intent 474. It should be understood that the compiledunderstanding model 400 may have been previously generated via themodel-based entity pinning phase 380 discussed above with respect toFIG. 10, such that the meaning representations 382 therein align withthe inherent cues embedded within the artifact labels of the compiledunderstanding model 400. As such, in certain embodiments, the BE contextintent pinning phase 470 is performed on the compiled understandingmodel 400 during inference to further tune the search space 252 based onthe current contextual intent 474 of conversation. In other embodiments,the BE context intent pinning phase 470 may be performed concurrentlywith the model-based entity pinning phase 380, before the model-basedentity pinning phase 380, or alternatively, the model-based entitypinning phase 380 may be omitted.

With the relevant meaning representations 480 identified, the artifactpinning subsystem 250 may pin (block 482) or intent-pin the relevantmeaning representations 480 within the search space 252. The artifactpinning subsystem 250 may therefore pin the meaning search to particularmeaning representation candidates that relate to the contextual intent474, therefore efficiently leveraging available information to improvehow the user may interface with the BE 102. In certain embodiments, thepinning performed by the artifact pinning subsystem 250 at block 482enables the agent automation system 100 to leverage the contextualintent 474 to soft-pin (e.g., identify and retain) particular subsets ofthe compiled understanding model 400 for particularly relevant meaningsearches. The relevant meaning representations 480 identified within thecompiled understanding model 400 may be provided a scoring bonus duringsuch searching processes, thereby soft-pinning the generated searchspace 252 to provide more direct search paths for the meaning searches.The scoring bonus provided to the relevant meaning representations 480(or scoring penalty provided to irrelevant meaning representations) maybe any suitable increase above a threshold score. Thus, the artifactpinning subsystem 250 may increase the contribution of similarity scoresdetermined for meaning representations within the search space 252 thatmatch the contextual intent 474 (e.g., current topic of discussion) toimprove similarity scoring processes, imparting a preference to thecontextual intent 474 based on a particular conversational frame ofreference. 
By influencing similarity scores (e.g., soft-pinning) insteadof pruning the search space 252 based on the contextual intent 474, theartifact pinning subsystem 250 is designed to enable the agentautomation system 100 to correctly interpret user utterances that changethe topic of conversation, such as by enabling other intents to remainwithin meaning search results and potentially be identified as meaningsearch matches to the changed topic.
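
As a rough sketch of soft-pinning, the following example adds a fixed bonus to the similarity score of candidates that match the contextual intent, without removing the other candidates. The function name `soft_pin_scores`, the dictionary inputs, and the bonus value of 0.1 are all assumptions made for illustration; the disclosure does not prescribe a particular bonus.

```python
def soft_pin_scores(similarity, candidate_intents, contextual_intent, bonus=0.1):
    """Return adjusted scores; matching candidates receive an additive bonus."""
    return {
        candidate: score
        + (bonus if candidate_intents[candidate] == contextual_intent else 0.0)
        for candidate, score in similarity.items()
    }


candidate_intents = {"mr_schedule": "schedule", "mr_purchase": "purchase"}

# Close scores: the bonus tips the result toward the contextual intent.
close = soft_pin_scores(
    {"mr_schedule": 0.70, "mr_purchase": 0.75}, candidate_intents, "schedule"
)
best_when_close = max(close, key=close.get)

# Strong evidence of a topic change: the other intent still wins.
changed = soft_pin_scores(
    {"mr_schedule": 0.40, "mr_purchase": 0.95}, candidate_intents, "schedule"
)
best_when_changed = max(changed, key=changed.get)
```

Because non-matching candidates stay in the search space, an utterance that clearly changes the topic can still be matched to a different intent.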

In other embodiments, the artifact pinning subsystem 250 executes the pinning of block 482 as hard-pinning, in which meaning representations 382 of the compiled understanding model 400 that are not associated with the contextual intent 474 are removed from consideration (e.g., pruned). The artifact pinning subsystem 250 may therefore generate a restrictive or constrained embodiment of the search space 252 against which the search keys 254 are compared to identify the extracted artifacts 240, directing searching and similarity scoring resources to the remaining relevant meaning representations 480 of the compiled understanding model 400. It should be understood that the artifact pinning subsystem 250 may be individually tailored to perform the BE context intent pinning phase 470 via any suitable combination of soft-pinning or hard-pinning. For example, the artifact pinning subsystem 250 may utilize soft-pinning in response to determining that the contextual intent 474 is relatively uncommon, is associated with an average episode duration that is below a threshold duration, or has any other suitable quality indicative of an increased likelihood of imminent topic change. In embodiments in which the contextual intent 474 is not associated with an increased likelihood of imminent topic change, the artifact pinning subsystem 250 may utilize hard-pinning. In any case, it should be understood that the process 450 may be repeated in response to subsequent user utterances 122 until the contextual intent 474 is satisfied (e.g., all sub-branches of a slot-filling operation are filled).
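
The choice between soft-pinning and hard-pinning described above might be sketched as follows. The function names, the dictionary-based candidates, and the threshold values (`min_frequency`, `min_duration_s`) are hypothetical and chosen only to make the example concrete.

```python
def hard_pin(search_space, contextual_intent):
    """Prune every candidate that does not carry the contextual intent."""
    return [mr for mr in search_space if mr["intent"] == contextual_intent]


def choose_pinning_mode(intent_frequency, avg_episode_s,
                        min_frequency=0.05, min_duration_s=60.0):
    """Pick soft-pinning when a topic change looks likely, else hard-pinning.

    An uncommon intent or a short average episode is treated here as a sign
    of an increased likelihood of imminent topic change (assumed thresholds).
    """
    if intent_frequency < min_frequency or avg_episode_s < min_duration_s:
        return "soft"
    return "hard"


search_space = [
    {"id": 1, "intent": "schedule"},
    {"id": 2, "intent": "purchase"},
    {"id": 3, "intent": "schedule"},
]
mode_common = choose_pinning_mode(intent_frequency=0.30, avg_episode_s=180.0)
mode_rare = choose_pinning_mode(intent_frequency=0.01, avg_episode_s=180.0)
constrained = hard_pin(search_space, "schedule")
```

Here the common, long-running intent is hard-pinned, which leaves only the two "schedule" candidates in the constrained search space, while the rare intent would be soft-pinned instead.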

The BE context intent pinning phase 470 may be better understood withrespect to an example embodiment of a particular slot-filling operationof BE 102 in which the user has already indicated a particular intent topurchase an item. Within the purchase intent or context identified as aparticular contextual intent 474 (e.g., target intent, intent identifiedby BE 102 from previous user utterance 122), the BE 102 may respond byprompting the user for additional information related to the purchaseintent, and the artifact pinning subsystem 250 may analyzesubsequently-received user utterances by targeting (e.g.,intent-pinning) the entities as likely being slot-filling responsesassociated with the contextual intent 474. In such embodiments, the BE102 progresses through an episode of conversation that requests entitiesfrom the user that are missing and important for satisfying thecontextual intent 474, without redundantly requesting information thatthe user has already provided.

As another specific example, in response to receiving a particular user utterance 122 of “I want to schedule a meeting at 10:00 am tomorrow,” the NLU framework 104 may identify that the contextual intent 474 is a “schedule meeting” intent that is associated with a “meeting start time” entity within the intent-entity model 108. As such, the BE 102 may provide an agent response to request that the user provide entities for meeting participants, a meeting subject, and/or a meeting end time. Because the NLU framework 104 is performing the BE context intent pinning phase 470, the search space 252 for meaning search may be desirably narrowed to or directed toward the meeting setup intent, thereby improving agent responses by discrediting or disregarding meaning representations 382 that are not associated with the meeting setup intent.

In other words, the artifact pinning subsystem 250 may recognize that a particular intent is currently being discussed or was previously being discussed during the course of conversation based on input from the BE 102. As such, when the user has already requested scheduling of a meeting, the NLU framework 104 may receive a subsequent user utterance 122 indicating a meeting end time, which is identified as including a potential entity without any potential intent (e.g., a noun phrase). To better analyze the user utterance 122, the artifact pinning subsystem 250 desirably associates the contextual intent 474 with the entity of the user utterance 122 and boosts the confidence level (e.g., above a threshold confidence level) to encourage or force intent-specific entity matching, while permitting potential topic changes. This association effectively operates as a contextual intent pinning to facilitate high-quality and context-relevant meaning searches.

Technical effects of the present disclosure include providing an agent automation framework that implements an artifact pinning subsystem that controls operation of the meaning extraction subsystem and the meaning search subsystem to expand and subsequently narrow a search space used during intent inference. During generation of the search space, the artifact pinning subsystem may determine multiple different understandings of sample utterances within one or multiple intent-entity models by performing vocabulary adjustment and varied part-of-speech assignment to each sample utterance, thus generating a potentially-sizeable quantity of candidates for inclusion within the search space. To prune the candidates, the disclosed artifact pinning subsystem leverages artifact correlations of the sample utterances to prune meaning representations that are not valid representations of a particular sample utterance. That is, the sample utterances generally each belong to an intent that may have been labeled (e.g., by an author or ML-based annotation subsystem) with a particular entity, within the structure defined by the related intent-entity model. The artifact pinning subsystem analyzing the various meaning representations for the identified intent may therefore identify a set of meaning representations that are associated with the particular intent and include a respective entity corresponding to the labeled entity of a corresponding sample utterance. The artifact pinning subsystem may then re-express the set of meaning representations by altering the arrangement or number of nodes of the meaning representations associated with the set, remove any duplicate candidates, and generate the search space based on the remaining meaning representations with the appropriate pinned entity. During a meaning search based on a received utterance from a user, the search keys may be formed by generating multiple potential meaning representations of the received utterance.
The search space may also be generated or refined during the meaning search with respect to an inferenced, contextual intent or current topic of conversation between the user and a behavior engine, guiding further targeted pruning of the search space. As such, the artifact pinning subsystem improves a quality of meaning searches by removing or assigning similarity scores below a threshold to unviable and irrelevant meaning representations during search space compilation and inference processes.

The specific embodiments described above have been shown by way of example, and it should be understood that these embodiments may be susceptible to various modifications and alternative forms. It should be further understood that the claims are not intended to be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of this disclosure.

The techniques presented and claimed herein are referenced and applied to material objects and concrete examples of a practical nature that demonstrably improve the present technical field and, as such, are not abstract, intangible or purely theoretical. Further, if any claims appended to the end of this specification contain one or more elements designated as “means for [perform]ing [a function] . . . ” or “step for [perform]ing [a function] . . . ”, it is intended that such elements are to be interpreted under 35 U.S.C. 112(f). However, for any claims containing elements designated in any other manner, it is intended that such elements are not to be interpreted under 35 U.S.C. 112(f).

What is claimed is:
 1. An agent automation system, comprising: a memoryconfigured to store a natural language understanding (NLU) frameworkincluding a meaning search subsystem; and a processor configured toexecute instructions to cause the meaning search subsystem of the NLUframework to perform actions comprising: receiving a plurality of samplemeaning representations, wherein one or more labeled meaningrepresentations of the plurality of sample meaning representationscomprise a particular intent and a labeled entity that is associatedwith the particular intent; for each labeled meaning representation ofthe one or more labeled meaning representations, pinning a set ofmeaning representations from the plurality of sample meaningrepresentations, wherein each meaning representation of the set ofmeaning representations comprises the particular intent and a respectiveentity corresponding to the labeled entity of the labeled meaningrepresentation; and generating a search space based at least in part oneach set of meaning representations corresponding to each labeledmeaning representation, wherein the meaning search subsystem isconfigured to compare a search key meaning representation to searchspace meaning representations to identify one or more of the searchspace meaning representations as matches for the search key meaningrepresentation.
 2. The agent automation system of claim 1, wherein theinstructions are configured to cause the meaning search subsystem toperform actions comprising: receiving a second plurality of samplemeaning representations comprising one or more second labeled meaningrepresentations, wherein one or more second labeled meaningrepresentations of the second plurality of sample meaningrepresentations comprise a particular artifact and a labeled artifactthat is associated with the particular artifact; and for each secondlabeled meaning representation of the second plurality of sample meaningrepresentations, pinning a second set of meaning representations fromthe second plurality of sample meaning representations, wherein eachmeaning representation of the second set of meaning representationscomprises the particular artifact and a respective artifactcorresponding to the labeled artifact of the second labeled meaningrepresentation; and generating the search space based at least in parton each set of meaning representations corresponding to each labeledmeaning representation and each second set of meaning representationscorresponding to each second labeled meaning representation.
 3. The agent automation system of claim 2, wherein the particular artifact is a second particular intent, and wherein the labeled artifact is a second labeled entity that is associated with the second particular intent of the respective second labeled meaning representation.
 4. The agent automation system of claim 2, wherein the plurality of sample meaning representations and the second plurality of sample meaning representations are received from a meaning extraction subsystem of the NLU framework.
 5. The agent automation system of claim 1, wherein the plurality of sample meaning representations are received from a meaning extraction subsystem of the NLU framework, wherein the meaning extraction subsystem generated the plurality of sample meaning representations from an intent-entity model comprising a plurality of sample utterances, and wherein the plurality of sample utterances comprise a set of intents that is associated with a set of entities.
 6. The agent automation system of claim 5, wherein the labeled entity is associated with the particular intent by an author of the intent-entity model, and wherein a parameter indicative of the association is stored within the intent-entity model.
 7. The agent automation system of claim1, wherein the instructions are configured to cause the meaning searchsubsystem to perform actions comprising: before generating the searchspace, for each labeled meaning representation of the one or morelabeled meaning representations: discarding a remaining set of meaningrepresentations of the plurality of sample meaning representations,wherein the remaining set of meaning representations do not comprise theparticular intent and a respective entity corresponding to the labeledentity.
 8. The agent automation system of claim 1, wherein theinstructions are configured to cause the meaning search subsystem toperform actions comprising: before generating the search space,re-expressing the set of meaning representations into a set ofre-expressed meaning representations by rearranging, adding, removing,and/or transforming tokens represented by respective nodes of the set ofmeaning representations; and disregarding any re-expressed meaningrepresentations of the set of re-expressed meaning representations thatdo not respectively comprise the particular intent and the labeledentity.
 9. The agent automation system of claim 8, wherein the instructions are configured to cause the meaning search subsystem to perform actions comprising: removing intent-level duplicates, entity-level duplicates, or a combination thereof from the set of re-expressed meaning representations before generating the search space.
 10. The agent automation system of claim 1, wherein the instructions are configured to cause the meaning search subsystem to perform actions comprising: removing intent-level duplicates, entity-level duplicates, or a combination thereof from the plurality of sample meaning representations before identifying the set of meaning representations from the plurality of sample meaning representations.
 11. The agentautomation system of claim 1, wherein the plurality of sample meaningrepresentations each comprise respective nodes representing tokens of arespective sample utterance, wherein a first portion of the respectivenodes of each labeled meaning representation are tagged with theparticular intent, and wherein a second portion of the respective nodesof each labeled meaning representation are tagged with the labeledentity.
 12. The agent automation system of claim 1, wherein theinstructions are configured to cause the meaning search subsystem toperform actions comprising: receiving the search key meaningrepresentation; and comparing the search key meaning representation tothe search space meaning representations to identify the one or moresearch space meaning representations that are matches for the search keymeaning representation.
 13. A method of operating an agent automationsystem, comprising: determining a contextual intent of a dialog betweena user and a behavior engine of the agent automation system; generatinga meaning representation of a received user utterance of the dialog as asearch key; refining a search space that comprises meaningrepresentations of sample utterances of an intent-entity model, whereinrefining comprises selecting a portion of the meaning representations ofthe search space that include the contextual intent to yield a refinedsearch space; and comparing the search key to the refined search spaceto identify at least one matching meaning representation from themeaning representations of the refined search space.
 14. The method ofclaim 13, wherein determining the contextual intent comprisesdetermining an intent that corresponds to a flow being executed by thebehavior engine with respect to the dialog with the user.
 15. The methodof claim 13, wherein the method comprises, before refining the searchspace, pruning irrelevant meaning representations from the search spacethat do not include the contextual intent.
 16. A non-transitory, computer-readable medium storing instructions that, when executed by one or more processors of an agent automation system, cause the agent automation system to implement a meaning search subsystem to: receive a plurality of meaning representations, wherein one or more labeled meaning representations of the plurality of meaning representations comprise a particular intent and a particular entity that is associated with the particular intent; for each labeled meaning representation of the one or more labeled meaning representations, pin a set of meaning representations from the plurality of meaning representations, wherein each meaning representation of the set of meaning representations comprises the particular intent and the particular entity; and generate a search space based at least in part on each set of meaning representations that corresponds to each labeled meaning representation.
 17. The non-transitory, computer-readable medium of claim 16, wherein the search space is based at least in part on each set of meaning representations that correspond to each labeled meaning representation.
 18. The non-transitory, computer-readable medium of claim 17, wherein the particular entity of each labeled meaning representation was associated with the particular intent of an intent-entity model, and wherein the intent-entity model is utilized to determine the plurality of meaning representations.
 19. The non-transitory, computer-readablemedium of claim 16, wherein the instructions cause the agent automationsystem to implement the meaning search subsystem to: re-express the setof meaning representations into a set of re-expressed meaningrepresentations by rearranging, adding, removing, and/or transformingtokens represented by respective nodes of the set of meaningrepresentations; and generate the search space based on eachre-expressed meaning representation of the set of re-expressed meaningrepresentations.
 20. The non-transitory, computer-readable medium ofclaim 16, wherein the instructions cause the agent automation system toimplement the meaning search subsystem to disregard any re-expressedmeaning representations of the set of re-expressed meaningrepresentations that do not respectively comprise the particular intentand the particular entity, before generating the search space.